Single Nucleotide Polymorphisms Sensitively Predicting 
Adverse Drug Reactions (ADR) and Drug Efficacy 



Technical Field 

[0001] This invention relates generally to genetic polymorphisms useful for assessing 
cardiovascular risks in humans, including, but not limited to, atherosclerosis, ischemia/reperfusion, 
hypertension, restenosis, arterial inflammation, myocardial infarction, and stroke. In addition it 
relates to genetic polymorphisms useful for assessing the response to lipid lowering drug therapy. 
More specifically, the present invention identifies and describes gene variations which are 
individually present in humans with cardiovascular disease states, relative to humans with normal, or 
non-cardiovascular disease states, and/or in response to medications relevant to cardiovascular 
disease. Further, the present invention provides methods for the identification and therapeutic use of 
compounds as treatments of cardiovascular disease. Moreover, the present invention provides 
methods for the diagnostic monitoring of patients undergoing clinical evaluation for the treatment of 
cardiovascular disease, and for monitoring the efficacy of compounds in clinical trials. Still further, 
the present invention provides methods to use gene variations to predict personal medication schemes 
omitting adverse drug reactions and allowing an adjustment of the drug dose to achieve maximum 
benefit for the patient. Additionally, the present invention describes methods for the diagnostic 
evaluation and prognosis of various cardiovascular diseases, and for the identification of subjects 
exhibiting a predisposition to such conditions. 

Background of the Invention 
[0002] Cardiovascular disease is a major health risk throughout the industrialized world. 
[0003] Cardiovascular diseases include but are not limited to the following disorders of the heart 
and the vascular system: congestive heart failure, myocardial infarction, atherosclerosis, ischemic 
diseases of the heart, coronary heart disease, all kinds of atrial and ventricular arrhythmias, 
hypertensive vascular diseases and peripheral vascular diseases. 

[0004] Heart failure is defined as a pathophysiologic state in which an abnormality of cardiac 
function is responsible for the failure of the heart to pump blood at a rate commensurate with the 
requirement of the metabolizing tissue. It includes all forms of pumping failure such as high-output 
and low-output, acute and chronic, right-sided or left-sided, systolic or diastolic, independent of the 
underlying cause. 
[0005] Myocardial infarction (MI) is generally caused by an abrupt decrease in coronary blood 
flow that follows a thrombotic occlusion of a coronary artery previously narrowed by arteriosclerosis. 
MI prophylaxis (primary and secondary prevention) is included as well as the acute treatment of MI 
and the prevention of complications. 

[0006] Ischemic diseases are conditions in which the coronary flow is restricted, resulting in a 
perfusion which is inadequate to meet the myocardial requirement for oxygen. This group of diseases 
includes stable angina, unstable angina and asymptomatic ischemia. 
[0007] Arrhythmias include all forms of atrial and ventricular tachyarrhythmias (atrial 
tachycardia, atrial flutter, atrial fibrillation, atrio-ventricular reentrant tachycardia, preexcitation 
syndrome, ventricular tachycardia, ventricular flutter, ventricular fibrillation) as well as bradycardic 
forms of arrhythmias. 

[0008] Hypertensive vascular diseases include primary as well as all kinds of secondary arterial 
hypertension (renal, endocrine, neurogenic, others). 

[0009] Peripheral vascular diseases are defined as vascular diseases in which arterial and/or venous 
flow is reduced resulting in an imbalance between blood supply and tissue oxygen demand. It 
includes chronic peripheral arterial occlusive disease (PAOD), acute arterial thrombosis and 
embolism, inflammatory vascular disorders, Raynaud's phenomenon and venous disorders. 
[0010] Atherosclerosis, the most prevalent of vascular diseases, is the principal cause of heart 
attack, stroke, and gangrene of the extremities, and thereby the principal cause of death. 
Atherosclerosis is a complex disease involving many cell types and molecular factors (for a detailed 
review, see Ross, NATURE 362:801-809 (1993) and Lusis, A.J., NATURE 407:233-241 (2000)). The 
process, in normal circumstances a protective response to insults to the endothelium and smooth 
muscle cells (SMCs) of the wall of the artery, consists of the formation of fibrofatty and fibrous 
lesions or plaques, preceded and accompanied by inflammation. The advanced lesions of 
atherosclerosis may occlude the artery concerned, and result from an excessive inflammatory- 
fibroproliferative response to numerous different forms of insult. For example, shear stresses are 
thought to be responsible for the frequent occurrence of atherosclerotic plaques in regions of the 
circulatory system where turbulent blood flow occurs, such as branch points and irregular structures. 
[0011] The first observable event in the formation of an atherosclerotic plaque occurs when blood- 
borne monocytes adhere to the vascular endothelial layer and transmigrate through to the sub- 
endothelial space. Adjacent endothelial cells at the same time produce oxidized low density 
lipoprotein (LDL). These oxidized LDLs are then taken up in large amounts by the monocytes 
through scavenger receptors expressed on their surfaces. In contrast to the regulated pathway by 
which native LDL (nLDL) is taken up by nLDL specific receptors, the scavenger pathway of uptake 
is not regulated by the monocytes. 
[0012] These lipid-filled monocytes are called foam cells, and are the major constituent of the fatty 
streak. Interactions between foam cells and the endothelial and SMCs which surround them lead to a 
state of chronic local inflammation which can eventually lead to smooth muscle cell proliferation and 
migration, and the formation of a fibrous plaque. Such plaques occlude the blood vessel concerned 
and thus restrict the flow of blood, resulting in ischemia. 

[0013] Ischemia is a condition characterized by a lack of oxygen supply in tissues of organs due to 
inadequate perfusion. Such inadequate perfusion can have a number of natural causes, including 
atherosclerotic or restenotic lesions, anemia, or stroke, to name a few. Many medical interventions, 
such as the interruption of the flow of blood during bypass surgery, for example, also lead to 
ischemia. In addition to sometimes being caused by diseased cardiovascular tissue, ischemia may 
sometimes affect cardiovascular tissue, such as in ischemic heart disease. Ischemia may occur in any 
organ, however, that is suffering a lack of oxygen supply. 

[0014] The most common cause of ischemia in the heart is atherosclerotic disease of epicardial 
coronary arteries. By reducing the lumen of these vessels, atherosclerosis causes an absolute decrease 
in myocardial perfusion in the basal state or limits appropriate increases in perfusion when the 
demand for flow is augmented. Coronary blood flow can also be limited by arterial thrombi, spasm, 
and, rarely, coronary emboli, as well as by ostial narrowing due to luetic aortitis. Congenital 
abnormalities, such as anomalous origin of the left anterior descending coronary artery from the 
pulmonary artery, may cause myocardial ischemia and infarction in infancy, but this cause is very 
rare in adults. Myocardial ischemia can also occur if myocardial oxygen demands are abnormally 
increased, as in severe ventricular hypertrophy due to hypertension or aortic stenosis. The latter can 
be present with angina that is indistinguishable from that caused by coronary atherosclerosis. A 
reduction in the oxygen-carrying capacity of the blood, as in extremely severe anemia or in the 
presence of carboxy-hemoglobin, is a rare cause of myocardial ischemia. Not infrequently, two or 
more causes of ischemia will coexist, such as an increase in oxygen demand due to left ventricular 
hypertrophy and a reduction in oxygen supply secondary to coronary atherosclerosis. 
[0015] The foregoing studies are aimed at defining the role of particular gene variations presumed 
to be involved in the misleading of normal cellular function leading to cardiovascular disease. 
However, such approaches cannot identify the full panoply of gene variations that are involved in the 
disease process. 

[0016] At present, the only available treatments for cardiovascular disorders are pharmaceutical 
based medications that are not targeted to an individual's actual defect; examples include angiotensin 
converting enzyme (ACE) inhibitors and diuretics for hypertension, insulin supplementation for non- 
insulin dependent diabetes mellitus (NIDDM), cholesterol reduction strategies for dyslipidaemia, 
anticoagulants, β blockers for cardiovascular disorders and weight reduction strategies for obesity. If 
targeted treatment strategies were available, it might be possible to predict the response to a particular 
regime of therapy, which could markedly increase the effectiveness of such treatment. Although targeted 
therapy requires accurate diagnostic tests for disease susceptibility, once these tests are developed the 
opportunity to utilize targeted therapy will become widespread. Such diagnostic tests could initially 
serve to identify individuals at most risk of hypertension and could allow them to make changes in 
lifestyle or diet that would serve as preventative measures. The benefits associated with coupling the 
diagnostic tests with a system of targeted therapy could include the reduction in dosage of 
administered drugs and thus the amount of unpleasant side effects suffered by an individual. In more 
severe cases a diagnostic test may suggest that earlier surgical intervention would be useful in 
preventing a further deterioration in condition. 

[0017] It is an object of the invention to provide genetic diagnosis of predisposition or 
susceptibility for cardiovascular diseases. Another related object is to provide treatment to reduce or 
prevent or delay the onset of disease in those predisposed or susceptible to this disease. A further 
object is to provide means for carrying out this diagnosis. 

[0018] Accordingly, a first aspect of the invention provides a method of diagnosis of disease in an 
individual, said method comprising determining one, various or all genotypes in said individual of the 
genes listed in the Examples. 

[0019] In another aspect, the invention provides a method of identifying an individual predisposed 
or susceptible to a disease, said method comprising determining one, various or all genotypes in said 
individual of the genes listed in the Examples. 

[0020] The invention is of advantage in that it enables diagnosis of a disease or of certain disease 
states via genetic analysis which can yield useable results before onset of disease symptoms, or 
before onset of severe symptoms. The invention is further of advantage in that it enables diagnosis of 
predisposition or susceptibility to a disease or of certain disease states via genetic analysis. 
[0021] The invention may also be of use in confirming or corroborating the results of other 
diagnostic methods. The diagnosis of the invention may thus suitably be used either as an isolated 
technique or in combination with other methods and apparatus for diagnosis, in which latter case the 
invention provides a further test on which a diagnosis may be assessed. 

[0022] The present invention stems from using allelic association as a method for genotyping 
individuals, allowing the investigation of the molecular genetic basis for cardiovascular diseases. In a 
specific embodiment the invention tests for the polymorphisms in the sequences of the listed genes in 
the Examples. The invention demonstrates a link between these polymorphisms and predispositions to 
cardiovascular diseases by showing that allele frequencies significantly differ when individuals with 
"bad" serum lipids are compared to individuals with "good" serum levels. The meaning of "good and 
bad" serum lipid levels is defined in Table 1a. 
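
The comparison of allele frequencies between two groups can be illustrated with a short sketch. The specification does not name the statistical test used, so the Pearson chi-square test on a 2x2 allele count table shown here is an assumption chosen for illustration; the genotype data, the function names, and the group sizes are hypothetical.

```python
from collections import Counter


def allele_counts(genotypes):
    """Count individual alleles in a list of diploid genotypes such as 'AG'."""
    counts = Counter()
    for genotype in genotypes:
        counts.update(genotype)  # each character is one allele
    return counts


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 degree of freedom, no continuity
    correction) for the contingency table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must have a nonzero total")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Hypothetical genotypes at one polymorphic site (assumed data, not from the source).
good_lipids = ["AA"] * 30 + ["AG"] * 15 + ["GG"] * 5
bad_lipids = ["AA"] * 15 + ["AG"] * 20 + ["GG"] * 15

good = allele_counts(good_lipids)
bad = allele_counts(bad_lipids)
statistic = chi_square_2x2(good["A"], good["G"], bad["A"], bad["G"])

# 3.841 is the chi-square critical value for p = 0.05 at 1 degree of freedom.
print(f"chi-square = {statistic:.2f}, significant at 0.05: {statistic > 3.841}")
```

In practice a study of this kind would also need to address multiple testing across many sites, which this sketch does not attempt.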
[0023] The PROCAM algorithm also defines a risk assessment based on lipids (LDL-cholesterol, 
HDL-cholesterol, triglycerides) and risk factors like smoking, high blood pressure or diabetes 
mellitus (Assmann et al., AM J Cardiol 77:1179-1184 (1996)). 

[0024] Certain disease states would benefit, that is to say the suffering of the patient may be 
reduced or prevented or delayed, by administration of treatment or therapy in advance of disease 
appearance; this can be more reliably carried out if predisposition or 
susceptibility to disease can be diagnosed in advance. 

[0025] Adverse drug reactions (ADRs) remain a major clinical problem. A recent meta-analysis 
suggested that in the USA in 1994, ADRs were responsible for 100 000 deaths, making them between 
the fourth and sixth commonest cause of death (Lazarou et al., J. AM. MED. ASSOC. 279:1200 (1998)). 
Although these figures have been heavily criticized, they emphasize the importance of ADRs. Indeed, 
there is good evidence that ADRs account for 5% of all hospital admissions and increase the length 
of stay in hospital by two days at an increased cost of approximately $2500 per patient. ADRs are also one of the 
commonest causes of drug withdrawal, which has enormous financial implications for the 
pharmaceutical industry. ADRs, perhaps fortunately, only affect a minority of those taking a 
particular drug. Although factors that determine susceptibility are unclear in most cases, there is 
increasing interest in the role of genetic factors. Indeed, the role of inheritable variations in 
predisposing patients to ADRs has been appreciated since the late 1950s and early 1960s through the 
discovery of deficiencies in enzymes such as pseudocholinesterase (butyrylcholinesterase) and 
glucose-6-phosphate dehydrogenase (G6PD). More recently, with the first draft of the human genome 
just completed, there has been renewed interest in this area with the introduction of terms such as 
pharmacogenomics and toxicogenomics. Essentially, the aim of pharmacogenomics is to produce 
personalized medicines, whereby administration of the drug class and dosage is tailored to an 
individual genotype. Thus, the term pharmacogenomics embraces both efficacy and toxicity. 
[0026] The 3-hydroxy-3-methylglutaryl coenzyme A (HMG-CoA) reductase inhibitors ("statins") 
specifically inhibit the enzyme HMG-CoA reductase which catalyzes the rate limiting step in 
cholesterol biosynthesis. These drugs are effective in reducing the primary and secondary risk of 
coronary artery disease and coronary events, such as heart attack, in middle-aged and older men and 
women, in both diabetic and non-diabetic patients, and are often prescribed for patients with 
hyperlipidemia. Statins used in secondary prevention of coronary artery or heart disease significantly 
reduce the risk of stroke, total mortality and morbidity and attacks of myocardial ischemia; the use of 
statins is also associated with improvements in endothelial and fibrinolytic functions and decreased 
platelet thrombus formation. 

[0027] The tolerability of these drugs during long term administration is an important issue. 
Adverse reactions involving skeletal muscle are not uncommon, and sometimes serious adverse 
reactions involving skeletal muscle such as myopathy and rhabdomyolysis may occur, requiring 
discontinuation of the drug. In addition an increase in serum creatine kinase (CK) may be a sign of a 
statin related adverse event. The extent of such adverse events can be read from the extent of the CK 
level increase (as compared to the upper limit of normal [ULN]). 
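
The relation between a measured CK value and the upper limit of normal can be expressed as a simple ratio. The following is a minimal sketch; the function name and the numeric values are hypothetical, and no grading threshold is implied because the text above does not state one.

```python
def ck_fold_over_uln(ck_units_per_l, uln_units_per_l):
    """Return the measured CK level as a multiple of the upper limit of normal (ULN)."""
    if uln_units_per_l <= 0:
        raise ValueError("ULN must be positive")
    if ck_units_per_l < 0:
        raise ValueError("CK level cannot be negative")
    return ck_units_per_l / uln_units_per_l


# Hypothetical values: a CK of 950 U/L against a laboratory ULN of 190 U/L.
fold = ck_fold_over_uln(950.0, 190.0)
print(f"CK is {fold:.1f} x ULN")
```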

[0028] Occasionally arthralgia, alone or in association with myalgia, has been reported. Also an 
elevation of liver transaminases has been associated with statin administration. 

[0029] It was shown that the drug response to statin therapy is a class effect, i.e. all known and 
presumably also all so far undiscovered statins share the same beneficial and harmful effects (Ucar, 
M. et al., DRUG SAFETY 22:441 (2000)). It follows that the discovery of diagnostic tools to predict the 
drug response to a single statin will also be of aid to guide therapy with other statins. 

[0030] The present invention provides diagnostic tests to predict the patient's individual response 
to statin therapy. Such responses include, but are not limited to, the extent of adverse drug reactions, 
the level of lipid lowering or the drug's influence on disease states. Those diagnostic tests may predict 
the response to statin therapy either alone or in combination with another diagnostic test or another 
drug regimen. 



Summary of the Invention 
[0031] The present invention provides diagnostic methods for assessing cardiovascular status in a 
human individual. Cardiovascular status as used herein refers to the physiological status of an 
individual's cardiovascular system as reflected in one or more markers or indicators. Status markers 
include without limitation clinical measurements such as, e.g., blood pressure, electrocardiographic 
profile, and differentiated blood flow analysis as well as measurements of LDL- and HDL- 
Cholesterol levels, other lipids and other well established clinical parameters that are standard in the 
art. Status markers according to the invention include diagnoses of one or more cardiovascular 
syndromes, such as, e.g., hypertension, acute myocardial infarction, silent myocardial infarction, 
stroke, and atherosclerosis. It will be understood that a diagnosis of a cardiovascular syndrome made 
by a medical practitioner encompasses clinical measurements and medical judgement. Status markers 
according to the invention are assessed using conventional methods well known in the art. Also 
included in the evaluation of cardiovascular status are quantitative or qualitative changes in status 
markers with time, such as would be used, e.g., in the determination of an individual's response to a 
particular therapeutic regimen. 

[0032] The methods are carried out by the steps of: (i) determining the sequence of one or more 
polymorphic positions within one, several or all of the genes listed in Examples or other genes 
mentioned in this file in the individual to establish a polymorphic pattern for the individual; and (ii) 
comparing the polymorphic pattern established in (i) with the polymorphic patterns of humans 
exhibiting different markers of cardiovascular status. The polymorphic pattern of the individual is, 
preferably, highly similar and, most preferably, identical to the polymorphic pattern of individuals 
who exhibit particular status markers, cardiovascular syndromes, and/or particular patterns of 
response to therapeutic interventions. Polymorphic patterns may also include polymorphic positions 
in other genes which are shown, in combination with one or more polymorphic positions in the genes 
listed in the Examples, to correlate with the presence of particular status markers. In one 
embodiment, the method involves comparing an individual's polymorphic pattern with polymorphic 
patterns of individuals who have been shown to respond positively or negatively to a particular 
therapeutic regimen. Therapeutic regimen as used herein refers to treatments aimed at the elimination 
or amelioration of symptoms and events associated with cardiovascular disease. Such treatments include 
without limitation one or more of alteration in diet, lifestyle, and exercise regimen; invasive and 
noninvasive surgical techniques such as atherectomy, angioplasty, and coronary bypass surgery; and 
pharmaceutical interventions, such as administration of ACE inhibitors, angiotensin II receptor 
antagonists, diuretics, alpha-adrenoreceptor antagonists, cardiac glycosides, phosphodiesterase 
inhibitors, beta-adrenoreceptor antagonists, calcium channel blockers, HMG-CoA reductase 
inhibitors, imidazoline receptor blockers, endothelin receptor blockers, organic nitrites, and 
modulators of protein function of genes listed in the Examples. Interventions with pharmaceutical 
agents not yet known whose activity correlates with particular polymorphic patterns associated with 
cardiovascular disease are also encompassed. It is contemplated, for example, that patients who are 
candidates for a particular therapeutic regimen will be screened for polymorphic patterns that 
correlate with responsivity to that particular regimen. 
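
Steps (i) and (ii) above can be sketched as a comparison of genotype patterns. This is an illustrative sketch only: the representation of a polymorphic pattern as a mapping from a site label to an unordered genotype, the similarity measure (fraction of shared sites with identical genotypes), and all names and data are assumptions not taken from the specification.

```python
def normalize(genotype):
    """Treat 'CT' and 'TC' as the same unordered diploid genotype."""
    return "".join(sorted(genotype.upper()))


def pattern_similarity(individual, reference):
    """Fraction of sites typed in both patterns at which the genotypes agree."""
    shared_sites = individual.keys() & reference.keys()
    if not shared_sites:
        return 0.0
    matches = sum(
        normalize(individual[site]) == normalize(reference[site])
        for site in shared_sites
    )
    return matches / len(shared_sites)


def best_matching_group(individual, reference_patterns):
    """Return the reference group with the most similar pattern, plus all scores."""
    scores = {
        name: pattern_similarity(individual, pattern)
        for name, pattern in reference_patterns.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical reference patterns for two response groups (assumed data).
references = {
    "responder": {"site1": "AA", "site2": "CT", "site3": "GG"},
    "non_responder": {"site1": "AG", "site2": "TT", "site3": "GG"},
}
patient = {"site1": "AA", "site2": "TC", "site3": "GG"}

group, scores = best_matching_group(patient, references)
print(group, scores)
```

A single consensus pattern per group is a simplification; a real comparison would more plausibly use genotype frequencies within each group.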

[0033] The present invention provides methods for determining the molecular structure of at least 
one polymorphic region of a gene, specific allelic variants of said polymorphic region being 
associated with cardiovascular disease. In one embodiment, determining the molecular structure of a 
polymorphic region of a gene comprises determining the identity of the allelic variant. A polymorphic 
region of a gene, of which specific alleles are associated with cardiovascular disease can be located in 
an exon, an intron, at an intron/exon border, or in the promoter of the gene. 
[0034] The invention provides methods for determining whether a subject has, or is at risk of 
developing a cardiovascular disease. Such disorders can be associated with an aberrant gene activity, 
e.g., abnormal binding to a form of a lipid, or an aberrant gene protein level. An aberrant gene protein 
level can result from an aberrant transcription or post-transcriptional regulation. Thus, allelic 
differences in specific regions of a gene can result in differences of gene protein due to differences in 
regulation of expression. In particular, some of the identified polymorphisms in the human gene may 
be associated with differences in the level of transcription, RNA maturation, splicing, or translation 
of the gene or transcription product. 
[0035] The present invention provides isolated nucleic acids comprising the polymorphic positions 
described herein for human genes; vectors comprising the nucleic acids; and transformed host cells 
comprising the vectors. The invention also provides probes which are useful for detecting these 
polymorphisms. 

[0036] The present invention encompasses isolated peptides and polypeptides encoded by genes 
listed in the Examples comprising polymorphic positions disclosed herein. In one preferred 
embodiment, the peptides and polypeptides are useful screening targets to identify cardiovascular 
drugs. In another preferred embodiment, the peptides and polypeptides are capable of eliciting 
antibodies in a suitable host animal that react specifically with a polypeptide comprising the 
polymorphic position and distinguish it from other polypeptides having a different sequence at that 
position. 

[0037] The invention provides diagnostic methods, e.g., for determining the identity of the allelic 
variants of polymorphic regions present in the gene loci of genes disclosed herein, wherein specific 
allelic variants of the polymorphic region are associated with cardiovascular diseases. In a preferred 
embodiment, the diagnostic kit can be used to determine whether a subject is at risk of developing a 
cardiovascular disease. This information could then be used, e.g., to optimize treatment of such 
individuals. 

[0038] The invention also provides antibody-based methods for detecting polymorphic patterns in 
a biological sample. The methods comprise the steps of: (i) contacting a sample with one or more 
antibody preparations, wherein each of the antibody preparations is specific for a particular 
polymorphic form of the proteins encoded by genes disclosed herein, under conditions in which a 
stable antigen-antibody complex can form between the antibody and antigenic components in the 
sample; and (ii) detecting any antigen-antibody complex formed in step (i) using any suitable means 
known in the art, wherein the detection of a complex indicates the presence of the particular 
polymorphic form in the sample. 

[0039] According to the present invention, nucleotide sequences derived from genes disclosed 
herein and peptide sequences encoded by genes disclosed herein, particularly those that contain one 
or more polymorphic sequences, comprise useful targets to identify cardiovascular drugs, i.e., 
compounds that are effective in treating one or more clinical symptoms of cardiovascular disease. 
Furthermore, a combination of different polymorphic subunits is very useful, especially when a 
protein is a multimeric protein that is built of two or more subunits. 

[0040] Additional aspects, advantages, and features of the invention will be set forth, in part, in the 
description that follows, and in part, will become apparent to those skilled in the art upon 
examination of the following, or may be learned by practice of the invention. 
Detailed Description of the Invention 
[0041] Definitions and Nomenclature 

[0042] For convenience, the meanings of certain terms and phrases used in the specification, 
examples, and appended claims are provided below. The definitions are also provided to further 
expand and explain the background of the invention. 

[0043] The term "allele," which is used interchangeably herein with "allelic variant" refers to 
alternative forms of a gene or portions thereof. Alleles occupy the same locus or position on 
homologous chromosomes. When a subject has two identical alleles of a gene, the subject is said to 
be homozygous for the gene or allele. When a subject has two different alleles of a gene, the subject 
is said to be heterozygous for the gene. Alleles of a specific gene can differ from each other in a 
single nucleotide, or several nucleotides, and can include substitutions, deletions, and insertions of 
nucleotides. An allele of a gene can also be a form of a gene containing a mutation. 
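
The homozygous/heterozygous distinction defined above can be stated compactly in code. This is a minimal sketch; the function name and the representation of a genotype as a pair of allele labels are assumptions made for illustration.

```python
def zygosity(genotype):
    """Classify a diploid genotype given as a pair of allele labels."""
    if len(genotype) != 2:
        raise ValueError("a diploid genotype needs exactly two alleles")
    first, second = genotype
    return "homozygous" if first == second else "heterozygous"


print(zygosity(("A", "A")))  # two identical alleles
print(zygosity(("A", "G")))  # two different alleles
```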
[0044] The term "allelic variant of a polymorphic region of a gene" refers to a region of a gene 
having one of several nucleotide sequences found in that region of the gene in other individuals. 
[0045] "Homology" or "identity" or "similarity" refers to sequence similarity between two peptides 
or between two nucleic acid molecules. Homology can be determined by comparing a position in 
each sequence which may be aligned for purposes of comparison. When a position in the compared 
sequence is occupied by the same base or amino acid, then the molecules are homologous at that 
position. A degree of homology between sequences is a function of the number of matching or 
homologous positions shared by the sequences. An "unrelated" or "non-homologous" sequence shares 
less than 40% identity, though preferably less than 25% identity, with one of the sequences of the 
present invention. 
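
The position-by-position comparison described above can be sketched as a percent identity calculation. The sketch assumes the two sequences have already been aligned to equal length and that a gap is written as "-" and never counts as a match; both conventions, and the example sequences, are assumptions rather than part of the definition.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of aligned positions occupied by the same base or amino acid."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        x == y and x != "-"  # a gap character is never counted as a match
        for x, y in zip(seq_a.upper(), seq_b.upper())
    )
    return 100.0 * matches / len(seq_a)


# Hypothetical aligned sequences differing at two of eight positions.
identity = percent_identity("ACGTACGT", "ACGTTCGA")
# Under the definition above, a sequence sharing less than 40% identity is "unrelated".
print(f"{identity:.1f}% identity, unrelated: {identity < 40.0}")
```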

[0046] The term "a homologue of a nucleic acid" refers to a nucleic acid having a nucleotide 
sequence having a certain degree of homology with the nucleotide sequence of the nucleic acid or 
complement thereof. A homologue of a double stranded nucleic acid having SEQ ID NO. X is 
intended to include nucleic acids having a nucleotide sequence which has a certain degree of 
homology with SEQ ID NO. X or with the complement thereof. Preferred homologues of nucleic 
acids are capable of hybridizing to the nucleic acid or complement thereof. 

[0047] The term "interact" as used herein is meant to include detectable interactions between 
molecules, such as can be detected using, for example, a hybridization assay. 

[0048] The term interact is also meant to include "binding" interactions between molecules. 
Interactions may be, for example, protein-protein, protein-nucleic acid, protein-small molecule or 
small molecule-nucleic acid in nature. 

[0049] The term "intronic sequence" or "intronic nucleotide sequence" refers to the nucleotide 
sequence of an intron or portion thereof. 
[0050] The term "isolated" as used herein with respect to nucleic acids, such as DNA or RNA, 
refers to molecules separated from other DNAs or RNAs, respectively, that are present in the natural 
source of the macromolecule. The term isolated as used herein also refers to a nucleic acid or peptide 
that is substantially free of cellular material, viral material, or culture medium when produced by 
recombinant DNA techniques, or chemical precursors or other chemicals when chemically 
synthesized. 

[0051] Moreover, an "isolated nucleic acid" is meant to include nucleic acid fragments which are 
not naturally occurring as fragments and would not be found in the natural state. The term "isolated" 
is also used herein to refer to polypeptides which are isolated from other cellular proteins and is 
meant to encompass both purified and recombinant polypeptides. 

[0052] The term "lipid" shall refer to a fat or fat-like substance that is insoluble in polar solvents 
such as water. The term "lipid" is intended to include true fats (e.g., esters of fatty acids and 
glycerol); lipids (phospholipids, cerebrosides, waxes); sterols (cholesterol, ergosterol) and 
lipoproteins (e.g., HDL, LDL and VLDL). 

[0053] The term "locus" refers to a specific position in a chromosome. For example, a locus of a 
gene refers to the chromosomal position of the gene. 

[0054] The term "modulation" as used herein refers to both up-regulation, (i.e., activation or 
stimulation), for example by agonizing, and down-regulation (i.e. inhibition or suppression), for 
example by antagonizing of a bioactivity (e.g., expression of a gene). 

[0055] The term "molecular structure" of a gene or a portion thereof refers to the structure as 
defined by the nucleotide content (including deletions, substitutions, additions of one or more 
nucleotides), the nucleotide sequence, the state of methylation, and/or any other modification of the 
gene or portion thereof. 

[0056] The term "mutated gene" refers to an allelic form of a gene, which is capable of altering the 
phenotype of a subject having the mutated gene relative to a subject which does not have the mutated 
gene. If a subject must be homozygous for this mutation to have an altered phenotype, the mutation is 
said to be recessive. If one copy of the mutated gene is sufficient to alter the phenotype of the subject, 
the mutation is said to be dominant. If a subject has one copy of the mutated gene and has a 
phenotype that is intermediate between that of a homozygous and that of a heterozygous (for that 
gene) subject, the mutation is said to be co-dominant. 

[0057] As used herein, the term "nucleic acid" refers to polynucleotides such as deoxyribonucleic 
acid (DNA), and, where appropriate, ribonucleic acid (RNA). The term should also be understood to 
include, as equivalents, derivatives, variants and analogs of either RNA or DNA made from 
nucleotide analogs, including peptide nucleic acids (PNA), morpholino oligonucleotides (J. 
Summerton and D. Weller, Antisense and Nucleic Acid Drug Development 7:187 (1997)) and, as 
applicable to the embodiment being described, single (sense or antisense) and double-stranded 
polynucleotides. Deoxyribonucleotides include deoxyadenosine, deoxycytidine, deoxyguanosine, and 
deoxythymidine. For purposes of clarity, when referring herein to a nucleotide of a nucleic acid, 
which can be DNA or RNA, the terms "adenosine," "cytidine," "guanosine," and "thymidine" are 
used. It is understood that if the nucleic acid is RNA, a nucleotide having a uracil base is uridine. 
[0058] The term "nucleotide sequence complementary to the nucleotide sequence set forth in SEQ 
ID NO. x" refers to the nucleotide sequence of the complementary strand of a nucleic acid strand 
having SEQ ID NO. x. The term "complementary strand" is used herein interchangeably with the 
term "complement." The complement of a nucleic acid strand can be the complement of a coding 
strand or the complement of a non-coding strand. When referring to double stranded nucleic acids, 
the complement of a nucleic acid having SEQ ID NO. x refers to the complementary strand of the 
strand having SEQ ID NO. x or to any nucleic acid having the nucleotide sequence of the 
complementary strand of SEQ ID NO. x. When referring to a single stranded nucleic acid having the 
nucleotide sequence SEQ ID NO. x, the complement of this nucleic acid is a nucleic acid having a 
nucleotide sequence which is complementary to that of SEQ ID NO. x. The nucleotide sequences and 
complementary sequences thereof are always given in the 5' to 3' direction. The terms "complement" 
and "reverse complement" are used interchangeably herein. 
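
By way of illustration only, the relationship between a strand and its complement, both written in the 5' to 3' direction, can be sketched in Python as follows. The function name and the example sequence are assumptions made for this sketch and do not correspond to any SEQ ID NO. of the invention.

```python
# Illustrative sketch; names and the example sequence are hypothetical.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand of a DNA sequence, written 5' to 3'."""
    # Complement every base, then reverse so the result also reads 5' to 3'.
    return seq.translate(_COMPLEMENT)[::-1]


example = "ATGCCG"  # hypothetical sequence, read 5' to 3'
print(reverse_complement(example))
```

Because both strands are reported 5' to 3', the complement is obtained by complementing each base and reversing the order, which is why "complement" and "reverse complement" denote the same sequence here.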

[0059] The term "operably linked" is intended to mean that the promoter is associated with the 
nucleic acid in such a manner as to facilitate transcription of the nucleic acid. 
[0060] The term "polymorphism" refers to the coexistence of more than one form of a gene or 
portion thereof. A portion of a gene of which there are at least two different forms, i.e., two different 
nucleotide sequences, is referred to as a "polymorphic region of a gene." A polymorphic region can 
be a single nucleotide, the identity of which differs in different alleles. A polymorphic region can also 
be several nucleotides long. 

[0061] A "polymorphic gene" refers to a gene having at least one polymorphic region. 
[0062] A "polymorphic site" in a nucleotide sequence is often described using an "ambiguity 
code" that stands for the possible variations of nucleotides at that site. The list of ambiguity codes is 
summarized in the following table: 

Ambiguity Codes (IUPAC Nomenclature)

Code    Nucleotides
B       c/g/t
D       a/g/t
H       a/c/t
K       g/t
M       a/c
N       a/c/g/t
R       a/g
S       c/g
V       a/c/g
W       a/t
Y       c/t

So, for example, an "R" in a nucleotide sequence means that either an "a" or a "g" could be at that 
position. 
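
As a non-limiting sketch, the ambiguity codes listed in the table can be expanded programmatically into the concrete sequences they represent. The dictionary below restates the table; the function name and the four-nucleotide example are assumptions introduced only for illustration.

```python
from itertools import product

# IUPAC ambiguity codes as listed in the table; plain bases map to themselves.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "B": "CGT", "D": "AGT", "H": "ACT", "K": "GT", "M": "AC",
    "N": "ACGT", "R": "AG", "S": "CG", "V": "ACG", "W": "AT", "Y": "CT",
}


def expand_ambiguity(seq: str) -> list:
    """List every concrete sequence represented by an ambiguity-coded sequence."""
    choices = [IUPAC[base] for base in seq.upper()]
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical example: "R" at the third position stands for "a" or "g".
print(expand_ambiguity("ACRT"))
```

A sequence containing a single polymorphic site coded "R" therefore expands to exactly two sequences, one per allele.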

[0063] The terms "protein," "polypeptide," and "peptide" are used interchangeably herein when 
referring to a gene product. 

[0064] A "regulatory element," also termed herein "regulatory sequence" is intended to include 
elements which are capable of modulating transcription from a basic promoter and include elements 
such as enhancers and silencers. The term "enhancer," also referred to herein as "enhancer element," 
is intended to include regulatory elements capable of increasing, stimulating, or enhancing 
transcription from a basic promoter. The term "silencer," also referred to herein as "silencer element" 
is intended to include regulatory elements capable of decreasing, inhibiting, or repressing 
transcription from a basic promoter. Regulatory elements are typically present in 5' flanking regions 
of genes. However, regulatory elements have also been shown to be present in other regions of a 
gene, in particular in introns. Thus, it is possible that genes have regulatory elements located in 
introns, exons, coding regions, and 3' flanking sequences. Such regulatory elements are also intended 
to be encompassed by the present invention and can be identified by any of the assays that can be 
used to identify regulatory elements in 5' flanking regions of genes. 

[0065] The term "regulatory element" further encompasses "tissue specific" regulatory elements, 
i.e., regulatory elements which effect expression of the selected DNA sequence preferentially in 
specific cells (e.g., cells of a specific tissue). Gene expression occurs preferentially in a specific cell if 
expression in this cell type is significantly higher than expression in other cell types. The term 
"regulatory element" also encompasses non-tissue specific regulatory elements, i.e., regulatory 
elements which are active in most cell types. Furthermore, a regulatory element can be a constitutive 
regulatory element, i.e., a regulatory element which constitutively regulates transcription, as opposed 
to a regulatory element which is inducible, i.e., a regulatory element which is active primarily in 
response to a stimulus. A stimulus can be, e.g., a molecule, such as a hormone, cytokine, heavy metal, 
phorbol ester, cyclic AMP (cAMP), or retinoic acid. 

[0066] Regulatory elements are typically bound by proteins, e.g., transcription factors. The term 
"transcription factor" is intended to include proteins or modified forms thereof, which interact 
preferentially with specific nucleic acid sequences, i.e., regulatory elements, and which in appropriate 
conditions stimulate or repress transcription. Some transcription factors are active when they are in 
the form of a monomer. Alternatively, other transcription factors are active in the form of a dimer 
consisting of two identical proteins or different proteins (heterodimer). Modified forms of 
transcription factors are intended to refer to transcription factors having a post-translational 
modification, such as the attachment of a phosphate group. The activity of a transcription factor is 
frequently modulated by a post-translational modification. For example, certain transcription factors 
are active only if they are phosphorylated on specific residues. Alternatively, transcription factors can 
be active in the absence of phosphorylated residues and become inactivated by phosphorylation. A 
list of known transcription factors and their DNA binding sites can be found in public databases, 
e.g., the TFMATRIX Transcription Factor Binding Site Profile database. 

[0067] As used herein, the term "specifically hybridizes" or "specifically detects" refers to the 
ability of a nucleic acid molecule of the invention to hybridize to at least approximately 6, 12, 20, 30, 
40, 50, 60, 70, 80, 90, 100, 110, 120, 130 or 140 consecutive nucleotides of either strand of a gene. 
[0068] The term "wild-type allele" refers to an allele of a gene which, when present in two copies 
in a subject results in a wild-type phenotype. There can be several different wild-type alleles of a 
specific gene, since certain nucleotide changes in a gene may not affect the phenotype of a subject 
having two copies of the gene with the nucleotide changes. 

[0069] "Adverse drug reaction" (ADR) as used herein refers to an appreciably harmful or 
unpleasant reaction, resulting from an intervention related to the use of a medicinal product, which 
predicts hazard from future administration and warrants prevention or specific treatment, or alteration 
of the dosage regimen, or withdrawal of the product. In its most severe form an ADR might lead to 
the death of an individual. 

[0070] The term "Drug Response" is intended to mean any response that a patient exhibits upon 
drug administration. Specifically, drug response includes beneficial, i.e. desired, drug effects, ADR, or 
no detectable reaction at all. More specifically, the term drug response can also have a qualitative 
meaning, i.e. it embraces low or high beneficial effects and mild or severe ADR. 
The term "Statin Response" as used herein refers to drug response after statin 
administration. An individual drug response also includes how well the drug is metabolized: 
"bad metabolizers" accumulate the drug in the body and may therefore show side effects 
of the drug due to cumulative overdosing. 
[0071] "Candidate gene" as used herein includes genes that can be assigned to either normal 
cardiovascular function or to metabolic pathways that are related to onset and/or progression of 
cardiovascular diseases. 

[0072] With regard to drug response the term "candidate gene" includes genes that can be assigned 
to distinct phenotypes regarding the patient's response to drug administration. Those phenotypes may 
include patients who benefit from relatively small amounts of a given drug (high responders) or 
patients who need relatively high doses in order to obtain the same benefit (low responders). In 
addition those phenotypes may include patients who can tolerate high doses of a medicament without 
exhibiting ADR, or patients who suffer from ADR even after receiving only low doses of a 
medicament. 

[0073] As neither the development of cardiovascular diseases nor the patient's response to drug 
administration is completely understood, the term "candidate gene" may also comprise genes with 
presently unknown function. 

[0074] "PA SNP" (phenotype associated SNP) refers to a polymorphic site which shows a 
significant association with a patient's phenotype (healthy, diseased, low or high responder, drug 
tolerant, ADR prone, etc.). 

[0075] "PA gene" (phenotype associated gene) refers to a genomic locus harbouring a PA SNP, 
irrespective of the actual function of this gene locus. 

[0076] PA gene polypeptide refers to a polypeptide encoded at least in part by a PA gene. 
[0077] The term "Haplotype" as used herein refers to a group of two or more SNPs that are 
functionally and/or spatially linked, i.e., haplotypes define groups of SNPs that lie inside genes 
belonging to identical (or related metabolic) pathways and/or lie on the same chromosome. 
Haplotypes are expected to give better predictive/diagnostic information than a single SNP. 
[0078] The term "statin" is intended to embrace all inhibitors of the enzyme 3-hydroxy-3- 
methylglutaryl coenzyme A (HMG-CoA) reductase. Statins specifically inhibit the enzyme HMG- 
CoA reductase which catalyzes the rate limiting step in cholesterol biosynthesis. Known statins are 
Atorvastatin, Cerivastatin, Fluvastatin, Lovastatin, Pravastatin and Simvastatin. 
[0079] The present invention is based at least in part on the discovery that a specific allele of a 
polymorphic region of a so called "candidate gene" (as defined below) is associated with CVD or 
drug response. 

[0080] For the present invention the following candidate genes were analyzed: genes found to be 
expressed in cardiac tissue (Hwang et al., CIRCULATION 96:4146-4203 (1997)); and genes from the 
following metabolic pathways and their regulatory elements: 
[0081] Lipid metabolism 

[0082] Numerous studies have shown a connection between serum lipid levels and cardiovascular 
diseases. Candidate genes falling into this group include but are not limited to genes of the 
cholesterol pathway, apolipoproteins and their modifying factors. 
[0083] Coagulation 

[0084] Ischemic diseases of the heart and in particular myocardial infarction may be caused by a 
thrombotic occlusion. Genes falling into this group include all genes of the coagulation cascade and 
their regulatory elements. 
[0085] Inflammation 

[0086] Complications of atherosclerosis are the most common causes of death in Western 
societies. In broad outline atherosclerosis can be considered to be a form of chronic inflammation 
resulting from interaction between modified lipoproteins, monocyte-derived macrophages, T cells, and the 
normal cellular elements of the arterial wall. This inflammatory process can ultimately lead to the 
development of complex lesions, or plaques, that protrude into the arterial lumen. Finally plaque 
rupture and thrombosis result in the acute clinical complications of myocardial infarction and stroke 
(Glass et al., Cell 104:503-516 (2001)). 

[0087] It follows that all genes related to inflammatory processes, including but not limited to 
cytokines, cytokine receptors and cell adhesion molecules are candidate genes for CVD. 
[0088] Glucose and energy metabolism 

[0089] As glucose and energy metabolism are interdependent with the metabolism of lipids (see 
above), these pathways also contain candidate genes. Energy metabolism in general also relates 
to obesity, which is an independent risk factor for CVD (Melanson et al., CARDIOL REV 9:202-207 
(2001)). In addition, high blood glucose levels are associated with many microvascular and 
macrovascular complications and may therefore affect an individual's disposition to CVD 
(Duckworth, CURR ATHEROSCLER REP, 3:383-391 (2001)). 
[0090] Hypertension 

[0091] As hypertension is an independent risk factor for CVD, genes that are involved in the 
regulation of systolic and diastolic blood pressure also affect an individual's risk for CVD (Safar, CURR 
OPIN CARDIOL, 15:258-263 (2000)). Interestingly, hypertension and diabetes (see above) appear to be 
interdependent, since hypertension is approximately twice as frequent in patients with diabetes 
compared with patients without the disease. Conversely, recent data suggest that hypertensive 
persons are more predisposed to the development of diabetes than are normotensive persons (Sowers 
et al., Hypertension 37:1053-1059 (2001)). 
[0092] Genes related to drug response 

[0093] These genes belong to the metabolic pathways involved in the absorption, distribution, 
metabolism, excretion and toxicity (ADMET) of drugs. Prominent members of this group are the 
cytochrome P450 proteins, which catalyze many reactions involved in drug metabolism. 
[0094] Unclassified genes 

[0095] As stated above, the mechanisms that lead to cardiovascular diseases or define the patient's 
individual response to drugs are not completely elucidated. Hence, candidate genes that could not 
be assigned to the above listed categories were also analysed. The present invention is based 
at least in part on the discovery of polymorphisms that lie in genomic regions of unknown 
physiological function. 
[0096] Results 

[0097] After conducting an association study, we surprisingly found polymorphic sites in a number 
of candidate genes which show a strong correlation with the following phenotypes of the patients 
analysed. "Healthy" as used herein refers to individuals that neither suffer from existing CVD, nor 
exhibit an increased risk for CVD through their serum lipid level profile. "CVD prone" as used herein 
refers to individuals with existing CVD and/or a serum lipid profile that confers a high risk of developing 
CVD (see Table 1a for definitions of healthy and CVD prone serum lipid levels). "High responder" as 
used herein refers to patients who benefit from relatively small amounts of a given drug. "Low 
responder" as used herein refers to patients who need relatively high doses in order to obtain benefit 
from the medication. "Tolerant patient" refers to individuals who can tolerate high doses of a 
medicament without exhibiting adverse drug reactions. "ADR patient" as used herein refers to 
individuals who suffer from ADR or show clinical symptoms (like creatine kinase elevation in blood) 
even after receiving only minor doses of a medicament (see Table 1b for a detailed definition of drug 
response phenotypes). 

[0098] Polymorphic sites in candidate genes that were found to be significantly associated with 
any of the above mentioned phenotypes will be referred to as "phenotype associated SNPs" (PA 
SNPs). The respective genomic loci that harbour PA SNPs will be referred to as "phenotype 
associated genes" (PA genes), irrespective of the actual function of this gene locus. 
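
Purely as an illustration of how a significant association between a polymorphic site and a phenotype might be assessed, the following Python sketch applies a Pearson chi-square test to a 2x2 table of allele counts. The choice of test, the function name and the allele counts are assumptions made for this example; they are not the statistical procedure or data of the association study described herein.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple:
    """Pearson chi-square (1 degree of freedom) for the 2x2 table [[a, b], [c, d]].

    Rows: two phenotype groups (e.g. ADR patients and tolerant patients).
    Columns: counts of allele 1 and allele 2 at one polymorphic site.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical allele counts, invented solely for illustration.
chi2, p = chi_square_2x2(30, 70, 15, 85)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

Under this sketch, a site would be a candidate PA SNP when the p-value falls below a chosen significance threshold, which in practice would need correction for the number of sites tested.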
[0099] In particular we surprisingly found PA SNPs associated with CVD, drug efficacy (EFF) or 
adverse drug reactions (ADR) in the following genes. 

[0100] ABCB11: ATP-binding cassette, sub-family B (MDR/TAP), member 11 
[0101] The membrane-associated protein encoded by this gene is a member of the superfamily of 
ATP-binding cassette (ABC) transporters. ABC proteins transport various molecules across extra- 
and intra-cellular membranes. ABC genes are divided into seven distinct subfamilies (ABC1, 
MDR/TAP, MRP, ALD, OABP, GCN20, White). This protein is a member of the MDR/TAP 
subfamily. Members of the MDR/TAP subfamily are involved in multidrug resistance. The protein 
encoded by this gene is the major canalicular bile salt export pump in man. Mutations in this gene 
cause a form of progressive familial intrahepatic cholestases which are a group of inherited disorders 
with severe cholestatic liver disease from early infancy. 

[0102] ABCB4: ATP-binding cassette, sub-family B (MDR/TAP), member 4 
[0103] The membrane-associated protein encoded by this gene is a member of the superfamily of 
ATP-binding cassette (ABC) transporters. ABC proteins transport various molecules across extra- 
and intra-cellular membranes. ABC genes are divided into seven distinct subfamilies (ABC1, 
MDR/TAP, MRP, ALD, OABP, GCN20, White). This protein is a member of the MDR/TAP 
subfamily. Members of the MDR/TAP subfamily are involved in multidrug resistance as well as 
antigen presentation. This gene encodes a full transporter and member of the p-glycoprotein family of 
membrane proteins with phosphatidylcholine as its substrate. The function of this protein has not yet 
been determined; however, it may involve transport of phospholipids from liver hepatocytes into bile. 
Alternative splicing of this gene results in several products of undetermined function. 
[0104] ABCC1: ATP-binding cassette, sub-family C (CFTR/MRP), member 1 
[0105] The protein encoded by this gene is a member of the superfamily of ATP-binding cassette 
(ABC) transporters. ABC proteins transport various molecules across extra- and intra-cellular 
membranes. ABC genes are divided into seven distinct subfamilies (ABC1, MDR/TAP, MRP, ALD, 
OABP, GCN20, White). This full transporter is a member of the MRP subfamily which is involved in 
multi-drug resistance. This protein functions as a multispecific organic anion transporter, with 
oxidized glutathione, cysteinyl leukotrienes, and activated aflatoxin B1 as substrates. This protein also 
transports glucuronides and sulfate conjugates of steroid hormones and bile salts. Alternative splicing 
by exon deletion results in several splice variants but maintains the original open reading frame in all 
forms. 

[0106] ACTB mRNA for mutant beta-actin 

[0107] Beta actin is one of six different actin isoforms which have been identified. ACTB is one of 
the two nonmuscle cytoskeletal actins. Actins are highly conserved proteins that are involved in cell 
motility, structure and integrity. Alpha actins are a major constituent of the contractile apparatus. 

[0108] ACTIN, ALPHA SKELETAL MUSCLE (ALPHA-ACTIN 1) 

[0109] Actin alpha 1, which is expressed in skeletal muscle, is one of six different actin isoforms 
which have been identified. Actins are highly conserved proteins that are involved in cell motility, 
structure and integrity. Alpha actins are a major constituent of the contractile apparatus. 

[0110] ADCYAP1: adenylate cyclase activating polypeptide 1 (pituitary) 

[0111] This gene encodes adenylate cyclase activating polypeptide 1. Mediated by adenylate 
cyclase activating polypeptide 1 receptors, this polypeptide stimulates adenylate cyclase and 
subsequently increases the cAMP level in target cells. Adenylate cyclase activating polypeptide 1 is 
not only a hypophysiotropic hormone, but also functions as a neurotransmitter and neuromodulator. 
In addition, it plays a role in paracrine and autocrine regulation of certain types of cells. This gene is 
composed of five exons. Exons 1 and 2 encode the 5' UTR and signal peptide, respectively; exon 4 
encodes an adenylate cyclase activating polypeptide 1-related peptide; and exon 5 encodes the mature 
peptide and 3' UTR. This gene encodes three different mature peptides, including two isotypes: a 
shorter form and a longer form. 
[0112] ADRB3: adrenergic, beta-3-, receptor 

[0113] The ADRB3 gene product, beta-3-adrenergic receptor, is located mainly in adipose tissue 
and is involved in the regulation of lipolysis and thermogenesis. Beta adrenergic receptors are 
involved in the epinephrine- and norepinephrine-induced activation of adenylate cyclase through the 
action of G proteins. 

[0114] AGL: amylo-1,6-glucosidase, 4-alpha-glucanotransferase (glycogen debranching enzyme, 
glycogen storage disease type III) 

[0115] Glycogen debranching enzyme is involved in glycogen degradation and has two 
independent catalytic activities: a 4-alpha-glucanotransferase activity (EC 2.4.1.25) and an 
amylo-1,6-glucosidase activity (EC 3.2.1.33). The two activities occur at different sites on the single polypeptide 
chain. Mutations in this gene cause glycogen storage disease. A wide range of clinical and enzymatic 
variability occurs in glycogen debrancher deficiency, some of which may be due to tissue-specific 
alternative splicing. Six splice variants that differ in the 5' end have been identified in liver and 
muscle tissue. Variants 1, 5, and 6 are present in both liver and muscle, whereas variants 2, 3, and 4 
occur in muscle. Variants 1 through 4 encode identical proteins (isoform 1) that include 27 N- 
terminal amino acids not found in splice variants 5 and 6. Variants 5 and 6 encode different amino- 
terminal ends of 10 and 11 amino acids in protein isoforms 2 and 3, respectively, with the remainder 
of the peptide identical to that of isoform 1. 
[0116] AKAP1: A kinase (PRKA) anchor protein 1 

[0117] Anchors cAMP-dependent protein kinase near its physiological substrates, interacts with 
both the type I and type II regulatory subunits. 
[0118] Angiotensinogen gene 

[0119] The protein encoded by this gene, pre-angiotensinogen or angiotensinogen precursor, is 
expressed in the liver and is cleaved by the enzyme renin in response to lowered blood pressure. The 
resulting product, angiotensin I is then cleaved by angiotensin converting enzyme (ACE) to generate 
the physiologically active peptide angiotensin II. The protein is involved in maintaining blood 
pressure and in the pathogenesis of essential hypertension and preeclampsia. 
[0120] ANXA6: annexin A6 

[0121] Annexin VI belongs to a family of calcium-dependent membrane and phospholipid binding 
proteins. Although their functions are still not clearly defined, several members of the annexin family 
have been implicated in membrane-related events along exocytotic and endocytotic pathways. The 
annexin VI gene is approximately 60 kbp long and contains 26 exons. It encodes a protein of about 68 
kDa that consists of eight 68-amino acid repeats separated by linking sequences of variable lengths. It 
is highly similar to human annexins I and II sequences, each of which contain four such repeats. Exon 
21 of annexin VI is alternatively spliced, giving rise to two isoforms that differ by a 6-amino acid 
insertion at the start of the seventh repeat. Annexin VI has been implicated in mediating the 
endosome aggregation and vesicle fusion in secreting epithelia during exocytosis. 
[0122] AP2B1: adaptor-related protein complex 2, beta 1 subunit 

[0123] The beta adaptin subunit is part of the clathrin coat assembly complex which links clathrin 
to receptors in coated pits and vesicles. These vesicles are involved in endocytosis and Golgi 
processing. The beta 1 subunit is one of the assembly proteins which binds to clathrin and initiates 
coat formation. 

[0124] APOA1: apolipoprotein A-I 

[0125] APOA1 promotes cholesterol efflux from tissues to the liver for excretion. Apolipoprotein 
A-I is the major protein component of high density lipoprotein (HDL) in the plasma. Synthesized in 
the liver and small intestine, it consists of two identical chains of 77 amino acids; an 18-amino acid 
signal peptide is removed co-translationally and a 6-amino acid propeptide is cleaved post- 
translationally. Variation in the latter step, in addition to modifications leading to so-called isoforms, 
is responsible for some of the polymorphism observed. APOA1 is a cofactor for lecithin 
cholesterol acyltransferase (LCAT) which is responsible for the formation of most plasma cholesteryl 
esters. The APOA1, APOC3 and APOA4 genes are closely linked in both rat and human genomes. 
The A-I and A-IV genes are transcribed from the same strand, while the C-III gene is transcribed 
convergently in relation to A-I. Defects in the apolipoprotein A-I gene are associated with HDL 
deficiency and Tangier disease. 
[0126] APOA4: apolipoprotein A-IV 

[0127] The apolipoprotein (apo) A-IV gene contains 3 exons separated by two introns. A sequence 
polymorphism has been identified in the 3'UTR of the third exon. The primary translation product is a 
396-residue preprotein which after proteolytic processing is secreted from its primary site of synthesis, the 
intestine, in association with chylomicron particles. Although its precise function is not known, apo 
A-IV is a potent activator of lecithin-cholesterol acyltransferase in vitro. 
[0128] APOB: apolipoprotein B 

[0129] Apolipoprotein B (ApoB) is the main apolipoprotein of chylomicrons and low density 
lipoproteins (LDL). The protein occurs in the plasma in 2 main isoforms, apoB-48 and apoB-100. 
The first is synthesized exclusively by the gut, the second by the liver. The intestinal (B-48) and 
hepatic (B-100) forms of apoB are coded by a single gene and by a single mRNA transcript larger 
than 16 kb. The 2 proteins share a common amino terminal sequence. In the ApoB-100 isoform the 
precursor has 4,563 amino acids, and the mature apoB-100 has 4,536 amino acid residues. Mature, 
circulating B-48 is homologous over its entire length (estimated to be between 2,130 and 2,144 amino 
acid residues) with the amino-terminal portion of B-100 and contains no sequence from the carboxyl 
end of B-100. From structural studies, it is thought that apoB-48 represents the amino-terminal 47% 
of apoB-100 and that the carboxyl terminus of apoB-48 is in the vicinity of residue 2151 of apoB- 
100. Apolipoprotein B-48 may be the product of an intestinal mRNA with an in-frame UAA stop 
codon resulting from a C-to-U change in the codon CAA encoding Gln(2153) in apoB-100 mRNA. 
Since only the sequence that codes B-100 is present in genomic DNA, this presents the possibility of 
an organ-specific introduction of a stop codon to an mRNA and the change from CAA to UAA of 
codon 2153 of the message as a unique RNA editing process. 
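
For illustration only, the C-to-U change described above and the resulting relative length of apoB-48 can be sketched in Python. The helper name and the two-entry codon lookup are assumptions made for this sketch, and the fraction assumes that the residue numbering refers to the mature 4,536-residue apoB-100.

```python
# Minimal codon lookup covering only the two codons discussed in the text.
CODON_MEANING = {"CAA": "Gln", "UAA": "Stop"}


def edit_c_to_u(codon: str, position: int = 0) -> str:
    """Replace the C at the given codon position with U, mimicking C-to-U editing."""
    if codon[position] != "C":
        raise ValueError("no C at the requested position")
    return codon[:position] + "U" + codon[position + 1:]


edited = edit_c_to_u("CAA")
print(CODON_MEANING["CAA"], "->", CODON_MEANING[edited])

# Residues preceding the stop at codon 2153, relative to mature apoB-100
# (assumes mature-protein numbering for both values).
fraction = (2153 - 1) / 4536
print(f"{fraction:.1%}")
```

The computed fraction is consistent with the statement that apoB-48 represents approximately the amino-terminal 47% of apoB-100.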
[0130] APOD: apolipoprotein D 

[0131] Apolipoprotein D (Apo-D) is a component of high density lipoprotein that has no marked 
similarity to other apolipoprotein sequences. It has a high degree of homology to plasma retinol- 
binding protein and other members of the alpha 2 microglobulin protein superfamily of carrier 
proteins, also known as lipocalins. It is a glycoprotein of estimated molecular weight 33 kDa. Apo-D 
is closely associated with the enzyme lecithin:cholesterol acyltransferase, an enzyme involved in 
lipoprotein metabolism. 
[0132] Apolipoprotein B 

[0133] Apolipoprotein B (ApoB) is the main apolipoprotein of chylomicrons and low density 
lipoproteins (LDL). The protein occurs in the plasma in 2 main isoforms, apoB-48 and apoB-100. 
The first is synthesized exclusively by the gut, the second by the liver. The intestinal (B-48) and 
hepatic (B-100) forms of apoB are coded by a single gene and by a single mRNA transcript larger 
than 16 kb. The 2 proteins share a common amino terminal sequence. In the ApoB-100 isoform the 
precursor has 4,563 amino acids, and the mature apoB-100 has 4,536 amino acid residues. Mature, 
circulating B-48 is homologous over its entire length (estimated to be between 2,130 and 2,144 amino 
acid residues) with the amino-terminal portion of B-100 and contains no sequence from the carboxyl 
end of B-100. From structural studies, it is thought that apoB-48 represents the amino-terminal 47% 
of apoB-100 and that the carboxyl terminus of apoB-48 is in the vicinity of residue 2151 of apoB- 
100. Apolipoprotein B-48 may be the product of an intestinal mRNA with an in-frame UAA stop 
codon resulting from a C-to-U change in the codon CAA encoding Gln(2153) in apoB-100 mRNA. 
Since only the sequence that codes B-100 is present in genomic DNA, this presents the possibility of 
an organ-specific introduction of a stop codon to an mRNA and the change from CAA to UAA of 
codon 2153 of the message as a unique RNA editing process. 
[0134] APXL: apical protein-like (Xenopus laevis) 

[0135] The protein encoded by this gene shares significant similarities with the apical protein from 
Xenopus laevis which is implicated in amiloride-sensitive sodium channel activity. This gene is a 
strong candidate gene for ocular albinism type 1 syndrome. 
[0136] ARF4: ADP-ribosylation factor 4 

[0137] ADP-ribosylation factor 4 (ARF4) is a member of the human ARF gene family. These 
genes encode small guanine nucleotide-binding proteins that stimulate the ADP-ribosyltransferase 
activity of cholera toxin and play a role in vesicular trafficking and as activators of phospholipase D. 
The gene products include 6 ARF proteins and 11 ARF-like proteins and constitute 1 family of the 
RAS superfamily. The ARF proteins are categorized as class I (ARF1, ARF2, and ARF3), class II 
(ARF4 and ARF5) and class III (ARF6). The members of each class share a common gene 
organization. The ARF4 gene spans approximately 12 kb and contains six exons and five introns. 
ARF4 is the most divergent member of the human ARFs. Conflicting map positions at 3p14 or 3p21 
have been reported for this gene. 

[0138] ATP1A2: ATPase, Na+/K+ transporting, alpha 2 (+) polypeptide 

[0139] Alpha 2 subunit of the sodium- and potassium-transporting ATPase; required for Na+ and 
K+ gradient maintenance across plasma membrane. 

[0140] ATP1B1: ATPase, Na+/K+ transporting, beta 1 polypeptide 

[0141] Beta 1 subunit of the Na+/K+-ATPase. 

[0142] ATP1B3: ATPase, Na+/K+ transporting, beta 3 polypeptide 
[0143] Beta 3 subunit of the Na+/K+-ATPase. 

[0144] ATP2A2: ATPase, Ca++ transporting, cardiac muscle, slow twitch 2 

[0145] Slow twitch cardiac muscle Ca2+-ATPase; pumps calcium, may have a role in calcium 
signaling pathways. 

[0146] ATP5G1: ATP synthase, H+ transporting, mitochondrial F0 complex, subunit c 
(subunit 9), isoform 1 

[0147] Isoform 1 (PI) of subunit c, H+-translocating subunit of F0 ATP synthase; catalyzes the 
synthesis of ATP during oxidative phosphorylation. 

[0148] ATP6V1E: ATPase, H+ transporting, lysosomal 31kD, V1 subunit E 

[0149] This gene encodes a component of vacuolar ATPase (V-ATPase), a multisubunit enzyme 
that mediates acidification of eukaryotic intracellular organelles. V-ATPase dependent organelle 
acidification is necessary for such intracellular processes as protein sorting, zymogen activation, and 
receptor-mediated endocytosis. V-ATPase is comprised of a cytosolic V1 domain and a 
transmembrane V0 domain. The V1 domain consists of a hexamer of three A and three B subunits 
plus the C, D, and E subunits. It contains the ATP catalytic site. The encoded protein is known as the 
E subunit and is found ubiquitously. Pseudogenes for this gene have been found in the genome. 
[0150] ATPase, Ca++ transporting, cardiac muscle, fast twitch 1 
[0151] Fast-twitch skeletal muscle sarcoplasmic reticulum Ca2+-ATPase; pumps calcium. 
[0152] AXIN1: axin 

[0153] Strongly similar to murine Axin; may regulate embryonic axis formation. 
[0154] BMPR1A: bone morphogenetic protein receptor, type IA 
[0155] The bone morphogenetic protein (BMP) receptors are a family of transmembrane 
serine/threonine kinases that include the type I receptors BMPR1A and BMPR1B and the type II 
receptor BMPR2. These receptors are also closely related to the activin receptors, ACVR1 and 
ACVR2. The ligands of these receptors are members of the TGF-beta superfamily. TGF-betas and 
activins transduce their signals through the formation of heteromeric complexes with 2 different types 
of serine (threonine) kinase receptors: type I receptors of about 50-55 kD and type II receptors of 
about 70-80 kD. Type II receptors bind ligands in the absence of type I receptors, but they require 
their respective type I receptors for signaling, whereas type I receptors require their respective type II 
receptors for ligand binding. 
[0156] BRD3: bromodomain containing 3 

[0157] This gene was identified based on its homology to the gene encoding the RING3 protein, a 
serine/threonine kinase. The gene localizes to 9q34, a region which contains several major 
histocompatibility complex (MHC) genes. The function of the encoded protein is not known. 
[0158] CACNA1C: calcium channel, voltage-dependent, L type, alpha 1C subunit 
[0159] Alpha 1C subunit of the voltage-dependent calcium channel; channel is of the L type and is 
expressed in the heart. 

[0160] CALB2: calbindin 2, (29kD, calretinin) 

[0161] Calbindin 2 (calretinin), closely related to calbindin 1, is an intracellular calcium-binding 
protein belonging to the troponin C superfamily. Calbindin 1 is known to be involved in the vitamin- 
D-dependent calcium absorption through intestinal and renal epithelia, while the function of neuronal 
calbindin 1 and calbindin 2 is poorly understood. The sequence of the calbindin 2 cDNA reveals an 
open reading frame of 271 codons coding for a protein of 31,520 Da, and shares 58% identical 
residues with human calbindin 1. Calbindin 2 contains five presumably active and one presumably 
inactive calcium-binding domains. Comparison with the partial sequences available for chick and 
guinea pig calbindin 2 reveals that the protein is highly conserved in evolution. The calbindin 2
message was detected in the brain, while absent from heart muscle, kidney, liver, lung, spleen,
stomach and thyroid gland. There are two additional forms of alternatively spliced calbindin 2
mRNAs encoding C-terminally truncated proteins. Exon 7 can splice to exon 9, resulting in a frame
shift and a translational stop at the second codon of exon 9, and encoding calretinin-20k. Exon 7 can
also splice to exon 10, resulting in a frame shift and a translational stop at codon 15 of exon 10, and
encoding calretinin-22k. The truncated proteins are able to bind calcium.

[0162] CALCIUM-TRANSPORTING ATPASE PLASMA MEMBRANE, ISOFORMS 3A/3B
(EC 3.6.1.38) (CALCIUM PUMP) (PMCA3)

[0163] Plasma membrane Ca2+-ATPase 3; pumps calcium. 

[0164] CALM3: calmodulin 3 (phosphorylase kinase, delta) 

[0165] Calmodulin 3; binds calcium. 

[0166] CAV1: caveolin 1, caveolae protein, 22kD 

[0167] The scaffolding protein encoded by this gene is the main component of the caveolae plasma 
membranes found in most cell types. The protein links integrin subunits to the tyrosine kinase FYN, 
an initiating step in coupling integrins to the Ras-ERK pathway and promoting cell cycle progression. 
The gene is a tumor suppressor gene candidate and a negative regulator of the Ras-p42/44 MAP 
kinase cascade. CAV1 and CAV2 are located next to each other on chromosome 7 and express 
colocalizing proteins that form a stable hetero-oligomeric complex. By using alternative initiation 
codons in the same reading frame, two isoforms (alpha and beta) are encoded by a single transcript 
from this gene. 
[0168] CAV3: caveolin 3 

[0169] This gene encodes a caveolin family member, which functions as a component of the 
caveolae plasma membranes found in most cell types. Caveolin proteins are proposed to be 
scaffolding proteins for organizing and concentrating certain caveolin-interacting molecules. 
Mutations identified in this gene lead to interference with protein oligomerization or intra-cellular 
routing, disrupting caveolae formation and resulting in Limb-Girdle muscular dystrophy type-1C
(LGMD-1C), hyperCKemia or rippling muscle disease (RMD). Alternative splicing has been 
identified for this locus, with inclusion or exclusion of a differentially spliced intron. In addition, 
transcripts utilize multiple polyA sites and contain two potential translation initiation sites. 
[0170] CCR2: chemokine (C-C motif) receptor 2 

[0171] This gene encodes two isoforms of a receptor for monocyte chemoattractant protein-1, a 
chemokine which specifically mediates monocyte chemotaxis. Monocyte chemoattractant protein-1 is 
involved in monocyte infiltration in inflammatory diseases such as rheumatoid arthritis as well as in 
the inflammatory response against tumors. The receptors encoded by this gene mediate agonist-
dependent calcium mobilization and inhibition of adenylyl cyclase. This gene is located in the
chemokine receptor gene cluster region. Two alternatively spliced transcript variants are expressed by
the gene.

[0172] CDH1: cadherin 1, type 1, E-cadherin (epithelial) 

[0173] This gene is a classical cadherin from the cadherin superfamily. The encoded protein is a 
calcium dependent cell-cell adhesion glycoprotein comprised of five extracellular cadherin repeats, a 
transmembrane region and a highly conserved cytoplasmic tail. Mutations in this gene are correlated 
with gastric, breast, colorectal, thyroid and ovarian cancer. Loss of function is thought to contribute 
to progression in cancer by increasing proliferation, invasion, and/or metastasis. The ectodomain of 
this protein mediates bacterial adhesion to mammalian cells and the cytoplasmic domain is required 
for internalization. Identified transcript variants arise from mutation at consensus splice sites. 
[0174] CDH11: cadherin 11, type 2, OB-cadherin (osteoblast) 

[0175] This gene encodes a type II classical cadherin from the cadherin superfamily, integral 
membrane proteins that mediate calcium-dependent cell-cell adhesion. Mature cadherin proteins are 
composed of a large N-terminal extracellular domain, a single membrane-spanning domain, and a 
small, highly conserved C-terminal cytoplasmic domain. Type II (atypical) cadherins are defined 
based on their lack of a HAV cell adhesion recognition sequence specific to type I cadherins. 
Expression of this particular cadherin in osteoblastic cell lines, and its upregulation during 
differentiation, suggests a specific function in bone development and maintenance. Two splice 
variants have been identified, one of which encodes an isoform with a truncated cytoplasmic domain. 
[0176] CDH13: cadherin 13, H-cadherin (heart) 

[0177] This gene is a member of the cadherin superfamily. The encoded protein is a calcium 
dependent cell-cell adhesion glycoprotein comprised of five extracellular cadherin repeats, a 
transmembrane region but, unlike the typical cadherin superfamily member, lacks the highly 
conserved cytoplasmic region. This particular cadherin is a putative mediator of cell-cell interaction 
in the heart and may act as a negative regulator of neural cell growth. The gene locus is 
hypermethylated or deleted in breast, ovarian and lung cancers. Two major mRNA transcripts 
encoding identical proteins are found, products of alternative polyadenylation sites. 
[0178] CENPC1: centromere protein C 1 

[0179] Centromere protein C 1 is a centromere autoantigen and a component of the inner 
kinetochore plate. The protein is required for maintaining proper kinetochore size and a timely 
transition to anaphase. A putative pseudogene exists on chromosome 12.
[0180] Cholesteryl ester transfer protein (CETP) 

[0181] Cholesteryl ester transfer protein (CETP) transfers cholesteryl esters between lipoproteins.
CETP may affect susceptibility to atherosclerosis.
[0182] CLCN4: chloride channel 4

[0183] The CLCN family of voltage-dependent chloride channel genes comprises nine members 
(CLCN1-7, Ka and Kb) which demonstrate quite diverse functional characteristics while sharing 
significant sequence homology. Chloride channel 4 has an evolutionarily conserved CpG island and is
conserved in both mouse and hamster. This gene is mapped in close proximity to APXL (Apical
protein Xenopus laevis-like) and OA1 (Ocular albinism type I), which are both located on the human
X chromosome at band p22.3. The physiological role of chloride channel 4 remains unknown, but
the channel may contribute to the pathogenesis of neuronal disorders.
[0184] CLCNKA: chloride channel Ka 

[0185] Putative chloride channel; member of the CLC family of voltage-gated chloride channels. 
[0186] COL6A3: collagen, type VI, alpha 3 

[0187] This gene encodes the alpha 3 chain, one of the three alpha chains of type VI collagen, a 
beaded filament collagen found in most connective tissues. The alpha 3 chain of type VI collagen is 
much larger than the alpha 1 and 2 chains. This difference in size is largely due to an increase in the 
number of subdomains, similar to von Willebrand Factor type A domains, found in the amino 
terminal globular domain of all the alpha chains. These domains have been shown to bind 
extracellular matrix proteins, an interaction that explains the importance of this collagen in 
organizing matrix components. Mutations in the type VI collagen genes are associated with Bethlem 
myopathy. In addition to the full length transcript, four transcript variants have been identified that 
encode proteins with N-terminal globular domains of varying sizes. 

[0188] COL7A1: collagen, type VII, alpha 1 (epidermolysis bullosa, dystrophic, dominant 
and recessive) 

[0189] This gene encodes the alpha chain of type VII collagen. The type VII collagen fibril, 
composed of three identical alpha collagen chains, is restricted to the basement zone beneath 
stratified squamous epithelia. It functions as an anchoring fibril between the external epithelia and the 
underlying stroma. Mutations in this gene are associated with all forms of dystrophic epidermolysis 
bullosa. In the absence of mutations, however, an acquired form of this disease can result from an 
autoimmune response made to type VII collagen. 
[0190] COL9A3: collagen, type IX, alpha 3 

[0191] This gene encodes one of the three alpha chains of type IX collagen, the major collagen 
component of hyaline cartilage. Type IX collagen, a heterotrimeric molecule, is usually found in 
tissues containing type II collagen, a fibrillar collagen. Mutations in this gene are associated with
multiple epiphyseal dysplasia.
[0192] COMT: catechol-O-methyltransferase

[0193] Catechol-O-methyltransferase catalyzes the transfer of a methyl group from S- 
adenosylmethionine to catecholamines, including the neurotransmitters dopamine, epinephrine, and 
norepinephrine. This O-methylation results in one of the major degradative pathways of the 
catecholamine transmitters. In addition to its role in the metabolism of endogenous substances, 
COMT is important in the metabolism of catechol drugs used in the treatment of hypertension, 
asthma, and Parkinson disease. COMT is found in two forms in tissues, a soluble form (S-COMT) 
and a membrane-bound form (MB-COMT). The differences between S-COMT and MB-COMT 
reside within the N-termini. The transcript variants are formed through the use of alternative 
translation initiation sites and promoters. 

[0194] COX10: COX10 homolog, cytochrome c oxidase assembly protein, heme A: 
farnesyltransferase (yeast) 

[0195] Cytochrome c oxidase (COX), the terminal component of the mitochondrial respiratory 
chain, catalyzes the electron transfer from reduced cytochrome c to oxygen. This component is a 
heteromeric complex consisting of 3 catalytic subunits encoded by mitochondrial genes and multiple 
structural subunits encoded by nuclear genes. The mitochondrially-encoded subunits function in 
electron transfer, and the nuclear-encoded subunits may function in the regulation and assembly of 
the complex. This nuclear gene encodes heme A:farnesyltransferase, which is not a structural subunit 
but required for the expression of functional COX and functions in the maturation of the heme A 
prosthetic group of COX. This protein is predicted to contain 7-9 transmembrane domains localized 
in the mitochondrial inner membrane. A gene mutation, which results in the substitution of a lysine 
for an asparagine (N204K), is identified to be responsible for cytochrome c oxidase deficiency. In 
addition, this gene is disrupted in patients with CMT1A (Charcot-Marie-Tooth type 1A) duplication
and with HNPP (hereditary neuropathy with liability to pressure palsies) deletion.
[0196] CPB2: carboxypeptidase B2 (plasma, carboxypeptidase U) 
[0197] Carboxypeptidases are enzymes that hydrolyze C-terminal peptide bonds. The 
carboxypeptidase family includes metallo-, serine, and cysteine carboxypeptidases. According to 
their substrate specificity, these enzymes are referred to as carboxypeptidase A (cleaving aliphatic 
residues) or carboxypeptidase B (cleaving basic amino residues). The protein encoded by this gene is 
activated by trypsin and acts on carboxypeptidase B substrates. After thrombin activation, the mature 
protein downregulates fibrinolysis. Polymorphisms have been described for this gene and its 
promoter region. Available sequence data analyses indicate splice variants that encode different
isoforms.
[0198] CPO: coproporphyrinogen oxidase (coproporphyria, harderoporphyria)
[0199] Coproporphyrinogen oxidase; catalyzes oxidative decarboxylation in the sixth step of heme
biosynthesis.

[0200] CRYAB: crystallin, alpha B 

[0201] Crystallins are separated into two classes: taxon-specific, or enzyme, and ubiquitous. The 
latter class constitutes the major proteins of vertebrate eye lens and maintains the transparency and 
refractive index of the lens. Since lens central fiber cells lose their nuclei during development, these 
crystallins are made and then retained throughout life, making them extremely stable proteins. 
Mammalian lens crystallins are divided into alpha, beta, and gamma families; beta and gamma 
crystallins are also considered as a superfamily. Alpha and beta families are further divided into 
acidic and basic groups. Seven protein regions exist in crystallins: four homologous motifs, a 
connecting peptide, and N- and C-terminal extensions. Alpha crystallins are composed of two gene 
products: alpha-A and alpha-B, for acidic and basic, respectively. Alpha crystallins can be induced by 
heat shock and are members of the small heat shock protein (sHSP, also known as HSP20) family.
They act as molecular chaperones although they do not renature proteins and release them in the 
fashion of a true chaperone; instead they hold them in large soluble aggregates. Post-translational 
modifications decrease the ability to chaperone. These heterogeneous aggregates consist of 30-40 
subunits; the alpha-A and alpha-B subunits have a 3:1 ratio, respectively. Two additional functions of 
alpha crystallins are an autokinase activity and participation in the intracellular architecture. Alpha-A 
and alpha-B gene products are differentially expressed; alpha-A is preferentially restricted to the lens 
and alpha-B is expressed widely in many tissues and organs. Elevated expression of alpha-B 
crystallin occurs in many neurological diseases; a missense mutation cosegregated in a family with a 
desmin-related myopathy. 

[0202] CSF2RB: colony stimulating factor 2 receptor, beta, low-affinity (granulocyte- 
macrophage) 

[0203] CSF2RB is a common beta chain of the high affinity receptor for IL-3, IL-5 and CSF.
Defective CSF2RB has been reported to be associated with pulmonary alveolar proteinosis.
[0204] CUBN: cubilin (intrinsic factor-cobalamin receptor) 

[0205] Cubilin (CUBN) acts as a receptor for intrinsic factor-vitamin B12 complexes. Its role as a
receptor is supported by the presence of 27 CUB domains. Cubilin is located within the epithelium of
intestine and kidney. Mutations in CUBN may play a role in autosomal recessive megaloblastic
anemia.
[0206] CXorf6: chromosome X open reading frame 6

[0207] CYP17: cytochrome P450, subfamily XVII (steroid 17-alpha-hydroxylase), adrenal 
hyperplasia 

[0208] This gene encodes a member of the cytochrome P450 superfamily of enzymes. The 
cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug 
metabolism and synthesis of cholesterol, steroids and other lipids. This protein localizes to the 
endoplasmic reticulum. It has both 17-alpha-hydroxylase and 17,20-lyase activities and is a key 
enzyme in the steroidogenic pathway that produces progestins, mineralocorticoids, glucocorticoids, 
androgens, and estrogens. Mutations in this gene are associated with isolated steroid-17 alpha-
hydroxylase deficiency, 17-alpha-hydroxylase/17,20-lyase deficiency, pseudohermaphroditism, and
adrenal hyperplasia. 

[0209] CYP2C8: cytochrome P450, subfamily IIC (mephenytoin 4-hydroxylase), 
polypeptide 8 

[0210] This gene encodes a member of the cytochrome P450 superfamily of enzymes. The
cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug
metabolism and synthesis of cholesterol, steroids and other lipids. This protein localizes to the
endoplasmic reticulum and its expression is induced by phenobarbital. The enzyme is known to
metabolize many xenobiotics, including the anticonvulsive drug mephenytoin, benzo(a)pyrene,
7-ethoxycoumarin, and the anti-cancer drug taxol. Two transcript variants for this gene have been
described; it is thought that the longer form does not encode an active cytochrome P450 since its 
protein product lacks the heme binding site. This gene is located within a cluster of cytochrome P450 
genes on chromosome 10q24. 

[0211] CYP2E: cytochrome P450, subfamily IIE (ethanol-inducible)
[0212] This gene encodes a member of the cytochrome P450 superfamily of enzymes. The 
cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug 
metabolism and synthesis of cholesterol, steroids and other lipids. This protein localizes to the 
endoplasmic reticulum and is induced by ethanol, the diabetic state, and starvation. The enzyme 
metabolizes both endogenous substrates, such as ethanol, acetone, and acetal, as well as exogenous 
substrates including benzene, carbon tetrachloride, ethylene glycol, and nitrosamines which are 
premutagens found in cigarette smoke. Due to its many substrates, this enzyme may be involved in 
such varied processes as gluconeogenesis, hepatic cirrhosis, diabetes, and cancer. 
[0213] CYP3A4 

[0214] This gene, CYP3A4, encodes a member of the cytochrome P450 superfamily of enzymes.
The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug
metabolism and synthesis of cholesterol, steroids and other lipids. This protein localizes to the
endoplasmic reticulum and its expression is induced by glucocorticoids and some pharmacological
agents. This enzyme is involved in the metabolism of approximately half the drugs which are used
today, including acetaminophen, codeine, cyclosporin A, diazepam and erythromycin. The enzyme 
also metabolizes some steroids and carcinogens. This gene is part of a cluster of cytochrome P450 
genes on chromosome 7q21.1. Previously another CYP3A gene, CYP3A3, was thought to exist; 
however, it is now thought that this sequence represents a transcript variant of CYP3A4. 
[0215] CYP4F8: cytochrome P450, subfamily IVF, polypeptide 8 

[0216] This gene, CYP4F8, encodes a member of the cytochrome P450 superfamily of enzymes. 
The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug 
metabolism and synthesis of cholesterol, steroids and other lipids. This protein localizes to the 
endoplasmic reticulum and functions as a 19-hydroxylase of prostaglandins in seminal vesicles. This 
gene is part of a cluster of cytochrome P450 genes on chromosome 19. Another member of this 
family, CYP4F3, is approximately 18 kb away. 

[0217] CYP8B1: cytochrome P450, subfamily VIIIB (sterol 12-alpha-hydroxylase), 
polypeptide 1 

[0218] This gene encodes a member of the cytochrome P450 superfamily of enzymes. The 
cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug 
metabolism and synthesis of cholesterol, steroids and other lipids. This endoplasmic reticulum 
membrane protein catalyzes the conversion of 7 alpha-hydroxy-4-cholesten-3-one into 7-alpha,12- 
alpha-dihydroxy-4-cholesten-3-one. The balance between these two steroids determines the relative 
amounts of cholic acid and chenodeoxycholic acid both of which are secreted in the bile and affect 
the solubility of cholesterol. This gene is unique among the cytochrome P450 genes in that it is 
intronless. 

[0219] DBI: diazepam binding inhibitor (GABA receptor modulator, acyl-Coenzyme A 
binding protein) 

[0220] Diazepam binding inhibitor (acyl-CoA-binding protein); binds and induces medium-chain 
acyl-CoA ester synthesis. 

[0221] DEFA6: defensin, alpha 6, Paneth cell-specific 

[0222] Defensins are a family of microbicidal and cytotoxic peptides thought to be involved in host 
defense. They are abundant in the granules of neutrophils and also found in the epithelia of mucosal 
surfaces such as those of the intestine, respiratory tract, urinary tract, and vagina. Members of the 
defensin family are highly similar in protein sequence and distinguished by a conserved cysteine 
motif. Several alpha defensin genes appear to be clustered on chromosome 8. The protein encoded by 
this gene, defensin, alpha 6, is highly expressed in the secretory granules of Paneth cells of the small 
intestine, and likely plays a role in host defense of human bowel.
[0223] DEK: DEK oncogene (DNA binding)

[0224] Site-specific DNA binding protein; involved in transcriptional regulation and signal 
transduction. 

[0225] DFNA5: deafness, autosomal dominant 5 

[0226] Hearing impairment is a heterogeneous condition with over 40 loci described. The protein 
encoded by this gene is expressed in fetal cochlea; however, its function is not known. Nonsyndromic
hearing impairment is associated with a mutation in this gene. 
[0227] DGKD: diacylglycerol kinase, delta (130kD) 

[0228] Diacylglycerol kinase delta; phosphorylates the arachidonoyl type of diacylglycerol; 
contains a pleckstrin homology domain and an EPH domain. 
[0229] DOCK1: dedicator of cytokinesis 1

[0230] Dedicator of cytokinesis 1 binds to the SH3 domain of CRK protein. It may regulate cell
surface extension and may have a role in the cell surface extension of an engulfing cell around a 
dying cell during apoptosis. 
[0231] ECE1: endothelin converting enzyme 1 

[0232] Endothelin converting enzyme; metalloprotease that regulates a peptide involved in
vasoconstriction.

[0233] E-Selectin (CD62E) 

[0234] The endothelial leukocyte adhesion molecule-1 is expressed by cytokine-stimulated
endothelial cells. It is thought to be responsible for the accumulation of blood leukocytes at sites of 
inflammation by mediating the adhesion of cells to the vascular lining. It exhibits structural features 
such as the presence of lectin- and EGF-like domains followed by short consensus repeat (SCR) 
domains that contain 6 conserved cysteine residues. These proteins are part of the selectin family of 
cell adhesion molecules. This gene is present in single copy in the human genome and contains 14 
exons spanning about 13 kb of DNA. Adhesion molecules participate in the interaction between 
leukocytes and the endothelium and appear to be involved in the pathogenesis of atherosclerosis. 
[0235] ESR1: estrogen receptor 1 

[0236] Estrogen receptor; nuclear receptor transcription factor activated by ligand-binding, 
involved in hormone-mediated inhibition of gene expression. 
[0237] ESR2: estrogen receptor 2 (ER beta) 

[0238] Estrogen receptor beta 2; transcriptional activator involved in regulation of reproduction; 
exists in five isoforms. 

[0239] F2: coagulation factor II (thrombin) 

[0240] Coagulation factor II is proteolytically cleaved to form thrombin in the first step of the 
coagulation cascade which ultimately results in the stemming of blood loss. F2 also plays a role in
maintaining vascular integrity during development and postnatal life. Mutations in F2 lead to
various forms of thrombosis and dysprothrombinemia.

[0241] F3: coagulation factor III (thromboplastin, tissue factor) 

[0242] This gene encodes coagulation factor III which is a cell surface glycoprotein. This factor
enables cells to initiate the blood coagulation cascades, and it functions as the high-affinity receptor 
for the coagulation factor VII. The resulting complex provides a catalytic event that is responsible for 
initiation of the coagulation protease cascades by specific limited proteolysis. Unlike the other 
cofactors of these protease cascades, which circulate as nonfunctional precursors, this factor is a 
potent initiator that is fully functional when expressed on cell surfaces. There are 3 distinct domains 
of this factor: extracellular, transmembrane, and cytoplasmic. This protein is the only one in the 
coagulation pathway for which a congenital deficiency has not been described. 
[0243] F5: coagulation factor V (proaccelerin, labile factor) 

[0244] This gene encodes coagulation factor V which is an essential factor of the blood 
coagulation cascade. This factor circulates in plasma, and is converted to the active form by the 
release of the activation peptide by thrombin during coagulation. This generates a heavy chain and a 
light chain which are held together by calcium ions. The active factor V is a cofactor that participates 
with activated coagulation factor X to activate prothrombin to thrombin. Defects in this gene result in 
either an autosomal recessive hemorrhagic diathesis or an autosomal dominant form of 
thrombophilia, which is known as activated protein C resistance. 
[0245] F7: coagulation factor VII (serum prothrombin conversion accelerator) 
[0246] This gene encodes coagulation factor VII which is a vitamin K-dependent factor essential 
for hemostasis. This factor circulates in the blood in a zymogen form, and is converted to an active 
form by either factor IXa, factor Xa, factor XIIa, or thrombin by minor proteolysis. Upon activation
of the factor VII, a heavy chain containing a catalytic domain and a light chain containing 2 EGF-like
domains are generated, and two chains are held together by a disulfide bond. In the presence of factor
III and calcium ions, the activated factor then further activates the coagulation cascade by converting
factor IX to factor IXa and/or factor X to factor Xa. Alternative splicing of this gene results in 2
transcripts. Defects in this gene can cause coagulopathy. 

[0247] F9: coagulation factor IX (plasma thromboplastic component, Christmas disease,
hemophilia B) 

[0248] This gene encodes vitamin K-dependent coagulation factor IX that circulates in the blood as 
an inactive zymogen. This factor is converted to an active form by factor XIa, which excises the 
activation peptide and thus generates a heavy chain and a light chain held together by one or more 
disulfide bonds. The role of this activated factor IX in the blood coagulation cascade is to activate
factor X to its active form through interactions with Ca2+ ions, membrane phospholipids, and factor
VIII. Alterations of this gene, including point mutations, insertions and deletions, cause factor IX
deficiency, which is a recessive X-linked disorder, also called hemophilia B or Christmas disease. 
[0249] FABP3: fatty acid binding protein 3, muscle and heart (mammary-derived growth 
inhibitor) 

[0250] The intracellular fatty acid-binding proteins (FABPs) belong to a multigene family. FABPs
are divided into at least three distinct types, namely the hepatic-, intestinal- and cardiac-type. They
form 14-15 kDa proteins and are thought to participate in the uptake, intracellular metabolism and/or
transport of long-chain fatty acids. They may also be responsible for the modulation of cell growth and
proliferation. Fatty acid-binding protein 3 gene contains four exons and its function is to arrest 
growth of mammary epithelial cells. This gene is a candidate tumor suppressor gene for human breast 
cancer. 

[0251] FACL3: fatty-acid-Coenzyme A ligase, long-chain 3 

[0252] The protein encoded by this gene is an isozyme of the long-chain fatty-acid-coenzyme A 
ligase family. Although differing in substrate specificity, subcellular localization, and tissue 
distribution, all isozymes of this family convert free long-chain fatty acids into fatty acyl-CoA esters, 
and thereby play a key role in lipid biosynthesis and fatty acid degradation. This isozyme is highly 
expressed in brain, and preferentially utilizes myristate, arachidonate, and eicosapentaenoate as 
substrates. The amino acid sequence of this isozyme is 92% identical to that of the rat homolog.
[0253] FACL4: fatty-acid-Coenzyme A ligase, long-chain 4 

[0254] The protein encoded by this gene is an isozyme of the long-chain fatty-acid-coenzyme A 
ligase family. Although differing in substrate specificity, subcellular localization, and tissue 
distribution, all isozymes of this family convert free long-chain fatty acids into fatty acyl-CoA esters, 
and thereby play a key role in lipid biosynthesis and fatty acid degradation. This isozyme 
preferentially utilizes arachidonate as substrate. The absence of this enzyme may contribute to
mental retardation or Alport syndrome. Alternative splicing of this gene generates 2 transcript
variants. 

[0255] FMO1: flavin containing monooxygenase 1

[0256] Metabolic N-oxidation of the diet-derived amino-trimethylamine (TMA) is mediated by
flavin-containing monooxygenase and is subject to an inherited FMO3 polymorphism in man,
resulting in a small subpopulation with reduced TMA N-oxidation capacity and, in turn, fish odor
syndrome (trimethylaminuria). Three forms of the enzyme, FMO1 found in fetal liver, FMO2 found in
adult liver, and FMO3 are encoded by genes clustered in the 1q23-q25 region. Flavin-containing
monooxygenases are NADPH-dependent flavoenzymes that catalyze the oxidation of soft
nucleophilic heteroatom centers in drugs, pesticides, and xenobiotics.
[0257] GAA: glucosidase, alpha; acid (Pompe disease, glycogen storage disease type II)
[0258] This gene encodes acid alpha-glucosidase, which is essential for the degradation of 
glycogen to glucose in lysosomes. Different forms of acid alpha-glucosidase are obtained by 
proteolytic processing. Defects in this gene are the cause of glycogen storage disease II, also known
as Pompe's disease, which is an autosomal recessive disorder with a broad clinical spectrum. 
[0259] GAPD: glyceraldehyde-3-phosphate dehydrogenase 

[0260] Glyceraldehyde-3-phosphate dehydrogenase catalyzes an important energy-yielding step in
carbohydrate metabolism, the reversible oxidative phosphorylation of glyceraldehyde-3-phosphate in
the presence of inorganic phosphate and nicotinamide adenine dinucleotide (NAD). The enzyme
exists as a tetramer of identical chains. A GAPD pseudogene has been mapped to Xp21-p11 and 15
GAPD-like loci have been identified. 
[0261] GARS: glycyl-tRNA synthetase 

[0262] Aminoacyl-tRNA synthetases are a class of enzymes that charge tRNAs with their cognate 
amino acids. Glycyl-tRNA synthetase is an (alpha)2 dimer which belongs to the class II family of 
tRNA synthetases. It has been shown to be a target of autoantibodies in the human autoimmune 
diseases, polymyositis or dermatomyositis. 

[0263] GBE1: glucan (1,4-alpha-), branching enzyme 1 (glycogen branching enzyme,
Andersen disease, glycogen storage disease type IV)

[0264] This monomeric enzyme functions in glycogen synthesis by catalyzing the formation of
alpha-1,6-glucosidic linkages. It is most highly expressed in liver and muscle. Deficiency can result
in glycogen storage disease IV (Andersen's disease).
[0265] GP6: glycoprotein VI (platelet) 

[0266] Platelet glycoprotein VI; member of the paired Ig-like receptor family. 
[0267] GPR-55 

[0268] Member of the G protein-coupled receptor family. 

[0269] GPRC5C: G protein-coupled receptor, family C, group 5, member C 

[0270] The protein encoded by this gene is a member of the type 3 G protein-coupled receptor
family. Members of this superfamily are characterized by a signature 7-transmembrane domain motif.
The specific function of this protein is unknown; however, this protein may mediate the cellular
effects of retinoic acid on the G protein signal transduction cascade. Alternative splicing in the 5'
UTR of this gene results in two transcript variants.

[0271] 3-hydroxy-3-methylglutaryl coenzyme A synthase 

[0272] 3-hydroxy-3-methylglutaryl-Coenzyme A synthase; functions in the first step in 
ketogenesis. 
[0273] HK1: hexokinase 1

[0274] Hexokinases phosphorylate glucose to produce glucose-6-phosphate, thus committing 
glucose to the glycolytic pathway. This gene encodes a ubiquitous form of hexokinase which 
localizes to the outer membrane of mitochondria. Mutations in this gene have been associated with 
hemolytic anemia due to hexokinase deficiency. Alternative splicing of this gene results in five 
transcript variants which encode different isoforms, some of which are tissue-specific. Each isoform 
has a distinct N-terminus; the remainder of the protein is identical among all the isoforms. A sixth 
transcript variant has been described, but due to the presence of several stop codons, it is not thought 
to encode a protein. 

[0275] HLA-B associated transcript 3 (BAT3) 

[0276] A cluster of genes, BAT1-BAT5, has been localized in the vicinity of the genes for TNF
alpha and TNF beta. These genes are all within the human major histocompatibility complex class III
region. The protein encoded by this gene is a nuclear protein. It has been implicated in the control of 
apoptosis and regulating heat shock protein. There are three alternatively spliced transcript variants 
described for this gene. 

[0277] HMGCL: 3-hydroxymethyl-3-methylglutaryl-Coenzyme A lyase
(hydroxymethylglutaricaciduria)

[0278] 3-Hydroxy-3-methylglutaryl coenzyme A lyase; cleaves 3-OH-3-methylglutaryl CoA to
acetoacetic acid and acetyl CoA.

[0279] HNF4A: hepatocyte nuclear factor 4, alpha 

[0280] Nuclear hormone receptor transcription factor; regulates liver specific gene expression. 
[0281] Chromosome 12 BAC RP11-13J12 
[0282] Cathepsin B 

[0283] Cathepsin B; lysosomal cysteine (thiol) protease that cleaves APP. 
[0284] Chromosome 5 clone CTD-2235C13 
[0285] Chromosome 7 clone RP11-351B12 
[0286] Cytochrome P450 3A locus 

[0287] The CYP3A locus includes all the known members of the 3A subfamily of the cytochrome 
P450 superfamily of genes. These genes encode monooxygenases which catalyze many reactions 
involved in drug metabolism and synthesis of cholesterol, steroids and other lipids. The CYP3A
cluster consists of four genes, CYP3A43, CYP3A4, CYP3A7 and CYP3A5. The region also contains 
two pseudogenes, CYP3A5P1 and CYP3A5P2, as well as several extra exons which may or may not 
be included in transcripts produced from this region. Previously another CYP3A member, CYP3A3, 
was thought to exist; however, it is now thought that this sequence represents a transcript variant of 
CYP3A4. 
[0288] ITGB3

[0289] The ITGB3 protein product is the integrin beta chain beta 3. Integrins are integral cell- 
surface proteins composed of an alpha chain and a beta chain. A given chain may combine with 
multiple partners resulting in different integrins. Integrin beta 3 is found along with the alpha IIb
chain in platelets. Integrins are known to participate in cell adhesion as well as cell-surface mediated 
signalling. 

[0290] Methionine adenosyltransferase alpha subunit gene fragment. 

[0291] MAT1A encodes methionine adenosyltransferase I (alpha isoform). MAT1A catalyzes the
formation of S-adenosylmethionine from methionine and ATP. Both the beta and alpha isoforms may
be encoded by MAT1A. Methionine adenosyltransferase deficiency is known to be caused by
recessive as well as dominant mutations, the latter identified in autosomal dominant persistent
hypermethioninemia.

[0292] Homo sapiens PAC clone RP1-102K2 from 22q12.1-qter

[0293] Homo sapiens partial ZNF202 gene for zinc finger protein homolog, exon 4 

[0294] Zinc-finger protein 202 may repress genes involved in lipid metabolism; contains zinc
fingers.

[0295] Homo sapiens vHNF1-C mRNA
[0296] Hepatocyte Nuclear Factor 1.

[0297] Human 2.5 kb mRNA for cytoskeletal tropomyosin TM30(nm) 
[0298] Human c-kit gene 

[0299] KIT encodes the human homolog of the proto-oncogene c-kit. C-kit was first identified as 
the cellular homolog of the feline sarcoma viral oncogene v-kit. KIT is a type 3 transmembrane 
receptor for MGF (mast cell growth factor, also known as stem cell factor). Mutations in KIT are 
associated with gastrointestinal stromal tumors, mast cell disease, acute myelogenous leukemia, and
piebaldism. 

[0300] Human coagulation factor VII (F7) gene exon 1 and factor X (F10) gene, exon 1 
[0301] This gene encodes coagulation factor VII which is a vitamin K-dependent factor essential 
for hemostasis. This factor circulates in the blood in a zymogen form, and is converted to an active 
form by either factor IXa, factor Xa, factor XIIa, or thrombin by minor proteolysis. Upon activation
of factor VII, a heavy chain containing a catalytic domain and a light chain containing 2 EGF-like
domains are generated, and the two chains are held together by a disulfide bond. In the presence of factor
III and calcium ions, the activated factor then further activates the coagulation cascade by converting
factor IX to factor IXa and/or factor X to factor Xa. Alternative splicing of this gene results in 2
transcripts. Defects in this gene can cause coagulopathy. 
[0302] Human cytochrome P450 (CYP1A2) gene, exons 1 and 2

[0303] This gene, CYP1A2, encodes a member of the cytochrome P450 superfamily of enzymes.
The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug 
metabolism and synthesis of cholesterol, steroids and other lipids. The protein encoded by this gene 
localizes to the endoplasmic reticulum and its expression is induced by some polycyclic aromatic 
hydrocarbons (PAHs), some of which are found in cigarette smoke. The enzyme's endogenous 
substrate is unknown; however, it is able to metabolize some PAHs to carcinogenic intermediates. 
Other xenobiotic substrates for this enzyme include caffeine, aflatoxin B1, and acetaminophen. The
transcript from this gene contains four Alu sequences flanked by direct repeats in the 3' untranslated 
region. A related family member, CYP1A1, is located approximately 25 kb away from CYP1A2 on 
chromosome . 

[0304] Human multidrug resistance-associated protein mRNA 
[0305] See ABCC1.

[0306] Human succinyl CoA:3-oxoacid CoA transferase precursor (OXCT) mRNA 
[0307] The mitochondrial matrix enzyme 3-oxoacid CoA transferase is homodimeric. It is a key 
enzyme in the extrahepatic utilization of ketone bodies, catalyzing the reversible transfer of 
coenzyme A from succinyl-CoA to acetoacetate, a necessary step in ketolytic energy production. 
Deficiencies can result in intermittent ketoacidosis. 

[0308] Human T-lymphoma invasion and metastasis inducing TIAM1 protein (TIAM1) 
mRNA 

[0309] Member of the GDP-GTP exchange factor family of proteins; modulates the activity of 
Rho-like proteins; has a Dbl homology and pleckstrin homology domains. 
[0310] IL10: interleukin 10 

[0311] Interleukin 10 (cytokine synthesis inhibitory factor); functions as a specific chemotactic
factor for CD8+ T cells.

[0312] IL17R: interleukin 17 receptor 

[0313] Highly similar to murine Il17r; may play a role in T cell activation and induction of IL-2
(Il2).

[0314] IL3: interleukin 3 (colony-stimulating factor, multiple) 

[0315] Interleukin-3 (colony-stimulating factor); plays a role in hematopoiesis; member of a family
of growth factors. 

[0316] IL6: interleukin 6 (interferon, beta 2) 

[0317] Interleukin 6 (interferon-beta 2); induces the maturation of B cells into immunoglobulin- 
secreting cells. 
[0318] IL8RA: interleukin 8 receptor, alpha

[0319] Interleukin 8 receptor alpha; G protein-coupled receptor that mediates neutrophil 
chemotaxis and binds interleukin 8 (IL8). 
[0320] INHBC: inhibin, beta C 

[0321] This gene encodes the beta C chain of inhibin, a member of the TGF-beta superfamily. This 
subunit forms heterodimers with beta A and beta B subunits. Inhibins and activins, also members of 
the TGF-beta superfamily, are hormones with opposing actions and are involved in hypothalamic, 
pituitary, and gonadal hormone secretion, as well as growth and differentiation of various cell types. 
[0322] ITGAL: integrin, alpha L (antigen CD11A (p180), lymphocyte function-associated
antigen 1; alpha polypeptide) 

[0323] ITGAL encodes the integrin alpha L chain. Integrins are heterodimeric integral membrane 
proteins composed of an alpha chain and a beta chain. This I-domain containing alpha integrin 
combines with the beta 2 chain (ITGB2) to form the integrin lymphocyte function-associated antigen- 
1 (LFA-1), which is expressed on all leukocytes. LFA-1 plays a central role in leukocyte intercellular 
adhesion through interactions with its ligands, ICAMs 1-3 (intercellular adhesion molecules 1 
through 3), and also functions in lymphocyte costimulatory signaling. 

[0324] ITGB2: integrin, beta 2 (antigen CD18 (p95), lymphocyte function-associated antigen 
1; macrophage antigen 1 (mac-1) beta subunit) 

[0325] The ITGB2 protein product is the integrin beta chain beta 2. Integrins are integral cell- 
surface proteins composed of an alpha chain and a beta chain. A given chain may combine with 
multiple partners resulting in different integrins. For example, beta 2 combines with the alpha L chain 
to form the integrin LFA-1, and combines with the alpha M chain to form the integrin Mac-1. 
Integrins are known to participate in cell adhesion as well as cell-surface mediated signalling. 
[0326] KCNQ1: potassium voltage-gated channel, KQT-like subfamily, member 1 
[0327] KCNQ1 encodes the K+ channel subunit responsible for the delayed-rectifier K+ current in 
cardiac myocytes. The delayed-rectifier channel is completed by the protein encoded by KCNE1. 
Mutations in KCNQ1 cause inherited long-QT syndrome. 

[0328] LAMA3: laminin, alpha 3 (nicein (150kD), kalinin (165kD), BM600 (150kD),
epiligrin)

[0329] Laminins are basement membrane components thought to mediate the attachment, 
migration and organization of cells into tissues during embryonic development by interacting with 
other extracellular matrix components. The protein encoded by this gene is the alpha-3 chain of 
laminin 5, which is a complex glycoprotein composed of three subunits (alpha, beta, and gamma). 
Laminin 5 is thought to be involved in cell adhesion, signal transduction and differentiation of 
keratinocytes. Mutations in this gene have been identified as the cause of Herlitz type junctional 
epidermolysis bullosa. Alternative splicing has been observed at this locus but the full-length nature
of these variants has not been determined. 

[0330] LAMR1: laminin receptor 1 (67kD, ribosomal protein SA) 

[0331] Laminins, a family of extracellular matrix glycoproteins, are the major noncollagenous 
constituent of basement membranes. They have been implicated in a wide variety of biological 
processes including cell adhesion, differentiation, migration, signaling, neurite outgrowth and 
metastasis. Many of the effects of laminin are mediated through interactions with cell surface 
receptors. These receptors include members of the integrin family, as well as non-integrin laminin- 
binding proteins. This gene encodes a high-affinity, non-integrin family, laminin receptor 1. This 
receptor has been variously called 67 kD laminin receptor, 37 kD laminin receptor precursor (37LRP) 
and p40 ribosome-associated protein. The amino acid sequence of laminin receptor 1 is highly 
conserved through evolution, suggesting a key biological function. It has been observed that the level 
of the laminin receptor transcript is higher in colon carcinoma tissue and lung cancer cell lines than in
their normal counterparts. Also, there is a correlation between the upregulation of this polypeptide in
cancer cells and their invasive and metastatic phenotype. Multiple copies of this gene exist; however,
most of them are pseudogenes thought to have arisen from retropositional events.
[0332] LDLR: low density lipoprotein receptor (familial hypercholesterolemia) 
[0333] The low density lipoprotein receptor (LDLR) gene family consists of cell surface proteins 
involved in receptor-mediated endocytosis of specific ligands. Low density lipoprotein (LDL) is 
normally bound at the cell membrane and taken into the cell ending up in lysosomes where the 
protein is degraded and the cholesterol is made available for repression of microsomal enzyme 3- 
hydroxy-3-methylglutaryl coenzyme A (HMG CoA) reductase, the rate-limiting step in cholesterol 
synthesis. At the same time, a reciprocal stimulation of cholesterol ester synthesis takes place. 
Mutations in the LDL receptor (LDLR) gene cause the autosomal dominant disorder, familial 
hypercholesterolemia. 
[0334] LGALS7: lectin, galactoside-binding, soluble, 7 (galectin 7)

[0335] The galectins are a family of beta-galactoside-binding proteins implicated in modulating 
cell-cell and cell-matrix interactions. Differential and in situ hybridizations indicate that this lectin is 
specifically expressed in keratinocytes. It is expressed at all stages of epidermal differentiation (i.e., 
in basal and suprabasal layers). It is moderately repressed by retinoic acid. The protein was found 
mainly in stratified squamous epithelium. The antigen localized to basal keratinocytes, although it 
was also found, albeit at lower levels, in the suprabasal layers where it concentrated to areas of cell- 
to-cell contact. The cellular localization and its striking down-regulation in cultured keratinocytes 
imply a role in cell-cell and/or cell-matrix interactions necessary for normal growth control. 
[0336] LIMK1: LIM domain kinase 1

[0337] There are approximately 40 known eukaryotic LIM proteins, so named for the LIM domains 
they contain. LIM domains are highly conserved cysteine-rich structures containing 2 zinc fingers. 
Although zinc fingers usually function by binding to DNA or RNA, the LIM motif probably mediates 
protein-protein interactions. LIM kinase-1 and LIM kinase-2 belong to a small subfamily with a
unique combination of 2 N-terminal LIM motifs and a C-terminal protein kinase domain. LIMK1 is 
likely to be a component of an intracellular signaling pathway and may be involved in brain 
development. LIMK1 hemizygosity is implicated in the impaired visuospatial constructive cognition 
of Williams syndrome. Two splice variants have been identified.
[0338] LMNB2: lamin B2 

[0339] Lamin B2; member of a family of structural nuclear envelope proteins. 
[0340] LPL: lipoprotein lipase 

[0341] LPL encodes lipoprotein lipase, which is expressed in heart, muscle, and adipose tissue. 
LPL functions as a homodimer, and has the dual functions of triglyceride hydrolase and 
ligand/bridging factor for receptor-mediated lipoprotein uptake. Severe mutations that cause LPL 
deficiency result in type I hyperlipoproteinemia, while less extreme mutations in LPL are linked to 
many disorders of lipoprotein metabolism. 

[0342] LRP8: low density lipoprotein receptor-related protein 8, apolipoprotein e receptor 
[0343] This gene encodes an apolipoprotein E receptor, a member of the low density lipoprotein 
receptor (LDLR) family. Apolipoprotein E is a small lipophilic plasma protein and a component of 
lipoproteins such as chylomicron remnants, very low density lipoprotein (VLDL), and high density 
lipoprotein (HDL). The apolipoprotein E receptor is involved in cellular recognition and 
internalization of these lipoproteins. Alternative splicing generates three transcript variants for this 
gene; additional variants have been described, but their full length nature has not been determined. 
[0344] LSS: lanosterol synthase (2,3-oxidosqualene-lanosterol cyclase)

[0345] Lanosterol synthase ((S)-2,3-epoxysqualene mutase); catalyzes the cyclization of
(S)-2,3-oxidosqualene; forms lanosterol during sterol biosynthesis.

[0346] LTA: lymphotoxin alpha (TNF superfamily, member 1) 

[0347] Lymphotoxin alpha, a member of the tumor necrosis factor family, is a cytokine produced 
by lymphocytes. LTA is highly inducible, secreted, and exists as a homotrimeric molecule. LTA forms
heterotrimers with lymphotoxin-beta which anchors lymphotoxin-alpha to the cell surface. LTA 
mediates a large variety of inflammatory, immunostimulatory, and antiviral responses. LTA is also 
involved in the formation of secondary lymphoid organs during development and plays a role in 
apoptosis. 

[0348] MAOA: monoamine oxidase A 

[0349] MAOA encodes monoamine oxidase A, an enzyme that degrades amine neurotransmitters, 
such as dopamine, norepinephrine, and serotonin. Deficiency of this enzyme results in Brunner 
syndrome. 

[0350] MARCKS: myristoylated alanine-rich protein kinase C substrate 

[0351] The protein encoded by this gene is a substrate for protein kinase C. It is localized to the
plasma membrane and is an actin filament crosslinking protein. Phosphorylation by protein kinase C
or binding to calcium-calmodulin inhibits its association with actin and with the plasma membrane,
leading to its presence in the cytoplasm. The protein is thought to be involved in cell motility,
phagocytosis, membrane trafficking and mitogenesis.

[0352] MCL1: myeloid cell leukemia sequence 1 (BCL2-related) 

[0353] Similar to BCL2. 

[0354] MCP: membrane cofactor protein (CD46, trophoblast-lymphocyte cross-reactive 
antigen) 

[0355] Membrane cofactor protein; acts as the receptor for the measles virus, may be involved in 
the regulation of complement activation; contains SCRs. 
[0356] METTL1: methyltransferase-like 1 

[0357] This gene is an ortholog of the S. cerevisiae YDL201w gene, which is predicted to encode a 
methyltransferase. The gene product contains a conserved S-adenosylmethionine-binding motif, 
which is typical of a methyltransferase. Alternative splice variants encoding different protein 
isoforms and transcript variants utilizing alternative polyA sites have been described in the literature. 
[0358] MLLT3: myeloid/lymphoid or mixed-lineage leukemia (trithorax homolog,
Drosophila)

[0359] Serine and proline rich protein, has a nuclear targeting sequence. 
[0360] MTHFD1: methylenetetrahydrofolate dehydrogenase (NADP+ dependent),
methenyltetrahydrofolate cyclohydrolase, formyltetrahydrofolate synthetase 
[0361] This gene encodes a protein that possesses three distinct enzymatic activities, 
5,10-methylenetetrahydrofolate dehydrogenase, 5,10-methenyltetrahydrofolate cyclohydrolase and
10-formyltetrahydrofolate synthetase. Each of these activities catalyzes one of three sequential
reactions in the interconversion of 1-carbon derivatives of tetrahydrofolate, which are substrates for
methionine, thymidylate, and de novo purine syntheses. The trifunctional enzymatic activities are 
conferred by two major domains, an aminoterminal portion containing the dehydrogenase and 
cyclohydrolase activities and a larger synthetase domain. 
[0362] MTMR2: myotubularin related protein 2

[0363] This gene is a member of the myotubularin family and encodes a putative tyrosine 
phosphatase. Mutations in this gene are a cause of Charcot-Marie-Tooth disease type 4B, an 
autosomal recessive demyelinating neuropathy. This gene utilizes multiple polyA signals, only one of 
which has been determined. 

[0364] Muscle specific serine kinase (MSSK1; serine/threonine kinase 23, STK23) 
[0365] Highly similar to SRPK2; may be protein kinase for SR family of RNA splicing factors; 
contains a kinase domain. 

[0366] MVD: mevalonate (diphospho) decarboxylase 

[0367] The enzyme mevalonate pyrophosphate decarboxylase catalyzes the conversion of 
mevalonate pyrophosphate into isopentenyl pyrophosphate in one of the early steps in cholesterol 
biosynthesis. It decarboxylates and dehydrates its substrate while hydrolyzing ATP. 
[0368] MYH11: myosin, heavy polypeptide 11, smooth muscle 

[0369] The protein encoded by this gene is a smooth muscle myosin belonging to the myosin heavy 
chain family. The gene product is a subunit of a hexameric protein that consists of 2 heavy chain 
subunits and 2 pairs of non-identical light chain subunits. It functions as a major contractile protein, 
converting chemical energy into mechanical energy through the hydrolysis of ATP. The gene 
encoding a human ortholog of rat NUDE1 is transcribed from the reverse strand of the MYH11 gene, and
its 3' end overlaps with that of the latter. The pericentric inversion of chromosome 16
[inv(16)(p13q22)] produces a chimeric transcript consisting of the first 165 residues from the N
terminus of core-binding factor beta in a fusion with the C-terminal portion of the smooth muscle 
myosin heavy chain. This chromosomal rearrangement is associated with acute myeloid leukemia of 
the M4Eo subtype. Alternative splicing generates isoforms that are differentially expressed, with 
ratios changing during muscle cell maturation. Additional splice variants have been described but 
their full-length nature has not been determined.
[0370] MYH7: myosin, heavy polypeptide 7, cardiac muscle, beta
[0371] MYH7 encodes the cardiac muscle beta (or slow) isoform of myosin. Changes in the 
relative abundance of MYH7 and MYH6 (the alpha, or fast, isoform of cardiac myosin heavy chain) 
correlate with the contractile velocity of cardiac muscle. Mutations in MYH7 are associated with 
familial hypertrophic cardiomyopathy. 

[0372] NADH dehydrogenase (ubiquinone) 1, alpha subcomplex, 4 (9kD, MLRQ), NDUFA4 
[0373] Subunit of NADH-ubiquinone oxidoreductase (complex I); transports electrons from 
NADH to ubiquinone. 

[0374] NADH-UBIQUINONE OXIDOREDUCTASE CHAIN 5 (EC 1.6.5.3) 

[0375] Subunit of NADH-ubiquinone oxidoreductase (complex I); transports electrons from
NADH to ubiquinone.

[0376] NDUFA9: NADH dehydrogenase (ubiquinone) 1 alpha subcomplex, 9 (39kD) 
[0377] NGFB: nerve growth factor, beta polypeptide 

[0378] Nerve growth factor beta; has roles in neuronal differentiation and survival. 

[0379] NGFR: nerve growth factor receptor (TNFR superfamily, member 16) 

[0380] Nerve growth factor receptor contains an extracellular domain containing four 40-amino
acid repeats with 6 cysteine residues at conserved positions followed by a serine/threonine-rich
region, a single transmembrane domain, and a 155-amino acid cytoplasmic domain. The cysteine-rich
region contains the nerve growth factor binding domain.

[0381] NID2: nidogen 2 

[0382] Nidogen-2; basement membrane protein.

[0383] HSU15552: acidic 82 kDa protein mRNA 

[0384] Nonmuscle type myosin heavy chain 9 (MYH9) 

[0385] Non-muscle myosin heavy chain 9; motor protein that provides force for muscle 
contraction, cytokinesis and phagocytosis; contains an ATPase head domain and a rod-like tail 
domain. 

[0386] NPC1: Niemann-Pick disease, type C1

[0387] NPC1 was identified as the gene that, when mutated, results in Niemann-Pick C disease.
NPC1 encodes a putative integral membrane protein containing motifs consistent with a role in 
intracellular transport of cholesterol to post-lysosomal destinations. 
[0388] Nth endonuclease III-like 1 (NTHL1)
[0389] Endonuclease; excises damaged pyrimidines. 
[0390] NUCB2: nucleobindin 2 

[0391] Nucleobindin 2; may bind DNA and calcium; has DNA-binding and EF-hand domains, and 
a leucine-zipper. 
[0392] nuclear receptor subfamily 1, group I, member 2 (NR1I2)
[0393] The gene product belongs to the nuclear receptor superfamily, members of which are 
transcription factors characterized by a ligand-binding domain and a DNA-binding domain. The 
encoded protein is a transcriptional regulator of the cytochrome P450 gene CYP3A4, binding to the 
response element of the CYP3A4 promoter as a heterodimer with the 9-cis retinoic acid receptor 
RXR. It is activated by a range of compounds that induce CYP3A4, including dexamethasone and 
rifampicin. The gene product contains a zinc finger domain. Three alternatively spliced transcripts 
that encode different isoforms have been described, one of which encodes two products through the 
use of alternative translation initiation codons. Additional transcript variants derived from alternative 
promoter usage, alternative splicing, and/or alternative polyadenylation exist, but they have not been 
fully described. 

[0394] OGDH: oxoglutarate (alpha-ketoglutarate) dehydrogenase (lipoamide) 

[0395] Alpha-ketoglutarate or 2-oxoglutarate dehydrogenase; helps convert alpha-ketoglutarate to
succinyl coenzyme A in the Krebs cycle.

[0396] OXCT: 3-oxoacid CoA transferase 

[0397] The mitochondrial matrix enzyme 3-oxoacid CoA transferase is homodimeric. It is a key 
enzyme in the extrahepatic utilization of ketone bodies, catalyzing the reversible transfer of 
coenzyme A from succinyl-CoA to acetoacetate, a necessary step in ketolytic energy production. 
Deficiencies can result in intermittent ketoacidosis. 
[0398] P2RY1: purinergic receptor P2Y, G-protein coupled, 1 

[0399] Purinergic receptor P2Y1, a G protein-coupled receptor; mediates responses to ATP and 
increases inositol phosphate levels. 

[0400] PCCA: propionyl Coenzyme A carboxylase, alpha polypeptide 
[0401] PCCA encodes the alpha subunit of the heterodimeric mitochondrial enzyme Propionyl- 
CoA carboxylase. PCCA encodes the biotin-binding region of this enzyme. Mutations in either PCCA 
or PCCB (encoding the beta subunit) lead to an enzyme deficiency resulting in propionic acidemia.
[0402] PDGFB: platelet-derived growth factor beta polypeptide (simian sarcoma viral (v-sis) 
oncogene homolog) 

[0403] The protein encoded by this gene is a member of the platelet-derived growth factor family. 
The four members of this family are mitogenic factors for cells of mesenchymal origin and are 
characterized by a motif of eight cysteines. This gene product can exist either as a homodimer or as a 
heterodimer with the platelet-derived growth factor alpha polypeptide, where the dimers are 
connected by disulfide bonds. Mutations in this gene are associated with meningioma. Reciprocal 
translocations between chromosomes 22 and 7, at sites where this gene and that for COL1A1 are
located, are associated with a particular type of skin tumor called dermatofibrosarcoma protuberans 
resulting from unregulated expression of growth factor. Two splice variants have been identified for
this gene. 

[0404] PERIOD CIRCADIAN PROTEIN 2 (KIAA0347)

[0405] This gene is a member of the Period family of genes and is expressed in a circadian pattern
in the suprachiasmatic nucleus, the primary circadian pacemaker in the mammalian brain. Genes in
this family encode components of the circadian rhythms of locomotor activity, metabolism, and
behavior. Circadian expression in the suprachiasmatic nucleus continues in constant darkness, and a
shift in the light/dark cycle evokes a proportional shift of gene expression in the suprachiasmatic
nucleus. The specific function of this gene is not yet known.

[0406] Peroxisome proliferative activated receptor, delta (PPARD) 

[0407] Peroxisome proliferator-activated receptor delta is a member of the steroid hormone
receptor superfamily.

[0408] PGM5: phosphoglucomutase 5 

[0409] Phosphoglucomutase-related (aciculin) putative structural protein; interacts with the 
cytoskeletal proteins dystrophin and utrophin. 
[0410] PLA2G3: phospholipase A2, group III 

[0411] Group III secreted phospholipase A2; calcium-dependent, displays a preference for
phosphatidylglycerol over phosphatidylcholine. 

[0412] PLA2G4C: phospholipase A2, group IVC (cytosolic, calcium-independent) 

[0413] Group IVC calcium-independent phospholipase A2; hydrolyzes the phospholipid sn-2 ester
bond; member of the phospholipase family.

[0414] PLA2G6: phospholipase A2, group VI (cytosolic, calcium-independent) 

[0415] Cytosolic calcium-independent phospholipase A2; hydrolyzes the phospholipid sn-2 ester
bond; member of the phospholipase family.

[0416] PMVK: phosphomevalonate kinase 

[0417] Phosphomevalonate kinase; converts mevalonate-5-phosphate to mevalonate-5-diphosphate.
[0418] PNMT: phenylethanolamine N-methyltransferase 

[0419] Phenylethanolamine N-methyltransferase; converts norepinephrine to epinephrine. 
[0420] PON1: paraoxonase 1 
[0421] PON2: paraoxonase 2 

[0422] Paraoxonase/arylesterase 2; possibly functions in protecting low density lipoprotein against 
oxidative modification; member of a family that hydrolyzes toxic organophosphates. 
[0423] PPARA: peroxisome proliferative activated receptor, alpha 

[0424] Peroxisome proliferators are a diverse group of chemicals which include hypolipidemic 
drugs, herbicides, leukotriene antagonists, and plasticizers, and are so called because they induce an 



AttyDktNo. 1300-0015 
PATENT 



increase in the size and number of peroxisomes. Peroxisomes are subcellular organelles found in 
plants and animals, and contain enzymes for respiration, cholesterol and lipid metabolism. In fact, the
fibrate class of hypolipidemic drugs is used to reduce triglycerides and cholesterol in patients with 
hyperlipidemia, a major risk factor for coronary heart disease. The action of peroxisome proliferators 
is thought to be mediated via specific receptors belonging to the steroid hormone receptor 
superfamily, called PPARs. Thus far, four closely related subtypes, alpha, beta, gamma and delta, 
have been identified. The subtype PPAR-alpha, encoded by PPARA, is a nuclear transcription factor. 
Upon activation by peroxisome proliferators, it modulates the expression of target genes involved in 
lipid metabolism, suggesting a role for PPAR-alpha in lipid homeostasis.
[0425] PPARG: peroxisome proliferative activated receptor, gamma 
[0426] The protein encoded by this gene is a member of the peroxisome proliferator-activated 
receptor (PPAR) subfamily of nuclear receptors. PPARs form heterodimers with retinoid X receptors 
(RXRs) and these heterodimers regulate transcription of various genes. Three subtypes of PPARs are 
known: PPAR-alpha, PPAR-delta, and PPAR-gamma. The protein encoded by this gene is PPAR- 
gamma and is a regulator of adipocyte differentiation. Additionally, PPAR-gamma has been 
implicated in the pathology of numerous diseases including obesity, diabetes, atherosclerosis and 
cancer. Multiple transcript variants that use alternate promoters and splicing have been identified for 
this gene. At least three of these variants encode the same isoform. 

[0427] PPM1A: protein phosphatase 1A (formerly 2C), magnesium-dependent, alpha isoform 
[0428] Magnesium- or manganese-dependent alpha protein phosphatase 1A; regulates cell stress
responses. 

[0429] PROBABLE G PROTEIN-COUPLED RECEPTOR APJ 
[0430] PTPRA: protein tyrosine phosphatase, receptor type, A 

[0431] The protein encoded by this gene is a member of the protein tyrosine phosphatase (PTP) 
family. PTPs are known to be signaling molecules that regulate a variety of cellular processes 
including cell growth, differentiation, mitotic cycle, and oncogenic transformation. This PTP contains 
an extracellular domain, a single transmembrane segment and two tandem intracytoplasmic catalytic 
domains, and thus represents a receptor-type PTP. This PTP has been shown to dephosphorylate and 
activate Src family tyrosine kinases, and is implicated in the regulation of integrin signaling, cell 
adhesion and proliferation. Three alternatively spliced variants of this gene, which encode two 
distinct isoforms, have been reported. 

[0432] PYGM: phosphorylase, glycogen; muscle (McArdle syndrome, glycogen storage 
disease type V) 

[0433] Muscle glycogen phosphorylase. 






[0434] RTN1: reticulon 1

[0435] RXRA: retinoid X receptor, alpha 

[0436] Retinoid X receptors (RXRs) and retinoic acid receptors (RARs) are nuclear receptors that
mediate the biological effects of retinoids by their involvement in retinoic acid-mediated gene 
activation. These receptors exert their action by binding, as homodimers or heterodimers, to specific 
sequences in the promoters of target genes and regulating their transcription. The protein encoded by 
this gene is a member of the steroid and thyroid hormone receptor superfamily of transcriptional 
regulators. 

[0437] RXRB: retinoid X receptor, beta 

[0438] Retinoid X receptor beta; binds to and serves as transcriptional coactivator for retinoic acid. 
[0439] SCA1: spinocerebellar ataxia 1 (olivopontocerebellar ataxia 1, autosomal dominant, 
ataxin 1) 

[0440] The autosomal dominant cerebellar ataxias (ADCA) are a heterogeneous group of 
neurodegenerative disorders characterized by progressive degeneration of the cerebellum, brain stem 
and spinal cord. Clinically, ADCA has been divided into three groups: ADCA types I-III. ADCAI is
genetically heterogeneous, with five genetic loci, designated spinocerebellar ataxia (SCA) 1, 2, 3, 4 
and 6, being assigned to five different chromosomes. ADCAII, which always presents with retinal 
degeneration (SCA7), and ADCAIII often referred to as the 'pure' cerebellar syndrome (SCA5), are 
most likely homogeneous disorders. Several SCA genes have been cloned and shown to contain CAG 
repeats in their coding regions. ADCA is caused by the expansion of the CAG repeats, producing an 
elongated polyglutamine tract in the corresponding protein. The expanded repeats are variable in size 
and unstable, usually increasing in size when transmitted to successive generations. The function of 
the ataxins is not known. The SCA1 locus has been mapped to chromosome 6, and it has been 
determined that the diseased allele contains 41-81 CAG repeats, compared to 6-39 in the normal 
allele. Several transcript variants of SCA1 in the 5' UTR have been described; however, their full-
length nature is not known.
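By way of non-limiting illustration, the repeat ranges stated above can be applied to a sequence as in the following minimal sketch. The function names are hypothetical, the counting rule (longest uninterrupted run of CAG triplets) is a simplifying assumption, and a count of 40 falls in neither stated range and is reported as such.

```python
import re


def longest_cag_run(sequence):
    """Return the number of repeats in the longest uninterrupted CAG run."""
    runs = re.findall(r"(?:CAG)+", sequence.upper())
    return max((len(run) // 3 for run in runs), default=0)


def classify_sca1_allele(repeat_count):
    """Compare a repeat count with the ranges quoted in the text above."""
    if 6 <= repeat_count <= 39:
        return "normal range"
    if 41 <= repeat_count <= 81:
        return "disease range"
    return "outside the ranges given in the text"


# Illustrative, made-up sequence: 30 CAG repeats, an interruption, 2 more repeats.
example = "TT" + "CAG" * 30 + "CAT" + "CAG" * 2
print(longest_cag_run(example), classify_sca1_allele(longest_cag_run(example)))
```

The interruption is deliberately treated as ending a run; a laboratory protocol may count repeats differently.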
[0441] SDF1: stromal cell-derived factor 1 

[0442] Stromal cell-derived factor 1; lymphocyte chemoattractant that signals through the receptor 
CXCR4. 

[0443] SERPINA5: serine (or cysteine) proteinase inhibitor, clade A (alpha-1 antiproteinase,
antitrypsin), member 5 

[0444] Protein C inhibitor (plasminogen activator inhibitor III); may be a serine protease inhibitor;
member of the serpin family of serine protease inhibitors. 

[0445] SERPINH1: serine (or cysteine) proteinase inhibitor, clade H (heat shock protein 47),
member 1, (collagen binding protein 1) 

[0446] Colligin; collagen-binding protein; Similar to HSPs and to serpin family serine protease 
inhibitors. 

[0447] SLC21A6: solute carrier family 21 (organic anion transporter), member 6 
[0448] Organic anion transporter. 

[0449] SLC27A1: solute carrier family 27 (fatty acid transporter), member 1 
[0450] SULT1A2: sulfotransferase family, cytosolic, 1A, phenol-preferring, member 2 
[0451] Phenol-metabolizing sulfotransferase 2; sulfonates simple planar phenols. 
[0452] THBS3: Thrombospondin 3 

[0453] Thrombospondin 3 binds heparin and calcium; similar to murine Thbs3 
[0454] TBP: TATA box binding protein 

[0455] TATA box binding protein, component of the TFIID complex; functions in the initiation of
mRNA synthesis and basal transcription. 
[0456] TBXA2R: thromboxane A2 receptor 

[0457] Thromboxane A2 receptor (prostaglandin H2 receptor); G protein-coupled receptor, 
activates Ca2+-activated chloride channels; stimulates platelet aggregation and smooth muscle 
constriction. 

[0458] TCF2: transcription factor 2, hepatic; LF-B3; variant hepatic nuclear factor 
[0459] TCF2 encodes transcription factor 2, a liver-specific factor of the homeobox-containing 
basic helix-turn-helix family. The TCF2 protein is believed to form heterodimers with another liver- 
specific member of this transcription factor family, TCF1; depending on the TCF2 isoform, the result 
may be to activate or inhibit transcription of target genes. Mutation of TCF2 that disrupts normal 
function has been identified as the cause of MODY5 (maturity-onset diabetes of the young, type 5). A third
human transcript variant is believed to exist based on such a variant in the rat; however, to date such
an mRNA species has not been isolated. 
[0460] TETRAN: tetracycline transporter-like protein 
[0461] Similar to E. coli tetracycline resistance efflux protein. 

[0462] TGFB1: transforming growth factor, beta 1 (Camurati-Engelmann disease) 
[0463] Transforming growth factor-beta 1; regulates cell proliferation, differentiation, and
apoptosis. 

[0464] TGFB2: transforming growth factor, beta 2 

[0465] Transforming growth factor-beta 2 (glioblastoma-derived T cell suppressor factor); 
suppresses IL2-dependent growth of T cells; member of a family of cytokines that transmits signals
through transmembrane serine/threonine kinases. 

[0466] TGFB3: transforming growth factor, beta 3

[0467] Transforming growth factor-beta 3; transmits signals through transmembrane 
serine/threonine kinases, may be required for normal development of the lung and palate; member of 
family of cytokines, very strongly similar to murine Tgfb3. 

[0468] THPO: thrombopoietin (myeloproliferative leukemia virus oncogene ligand, 
megakaryocyte growth and development factor) 

[0469] Thrombopoietin; binds to c-Mpl receptor and regulates megakaryocyte development. 
[0470] TNFAIP2: tumor necrosis factor, alpha-induced protein 2 

[0471] Secreted by vascular endothelium, expression is induced by tumor necrosis factor alpha, 
interleukin-1 beta, and lipopolysaccharide. 
[0472] TRAP1: heat shock protein 75

[0473] Heat shock protein 75; binds and refolds denatured RB1 during M phase and after heat
shock; member of the HSP90 family of molecular chaperones. 
[0474] TRIP10: thyroid hormone receptor interactor 10 

[0475] Similar to the non-kinase domains of FER and Fes/Fps tyrosine kinases; binds to activated 
Cdc42 and may regulate actin cytoskeleton; contains an SH3 domain. 
[0476] TXN: thioredoxin 

[0477] Thioredoxin; has dithiol-disulfide oxidoreductase activity. 
[0478] USP6: ubiquitin specific protease 6 (Tre-2 oncogene) 

[0479] Ubiquitin specific protease 6 (Tre-2 oncogene); cleaves ubiquitin from proteins, has
predicted nucleic acid-binding properties.

[0480] UTRN: utrophin (homologous to dystrophin) 

[0481] This gene shares both structural and functional similarities with the dystrophin gene. It 
contains an actin-binding N-terminus, a triple coiled-coil repeat central region, and a C-terminus that 
consists of protein-protein interaction motifs which interact with dystroglycan protein components. 
The protein encoded by this gene is located at the neuromuscular synapse and myotendinous 
junctions, where it participates in post-synaptic membrane maintenance and acetylcholine receptor 
clustering. Mouse studies suggest that this gene may serve as a functional substitute for the
dystrophin gene and may therefore offer a potential therapeutic alternative in muscular dystrophy,
which is caused by mutations in the dystrophin gene. Alternative splicing of the utrophin gene has been
described; however, the full-length nature of these variants has not yet been determined. 
[0482] VEGF: vascular endothelial growth factor 

[0483] Vascular endothelial growth factor; induces endothelial cell proliferation and vascular 
permeability. 

[0484] VEGFB: vascular endothelial growth factor B

[0485] Vascular endothelial growth factor B; involved in angiogenesis and endothelial cell growth. 
[0486] WISP1: WNT1 inducible signaling pathway protein 1 

[0487] This gene encodes a member of the WNT1 inducible signaling pathway (WISP) protein 
subfamily, which belongs to the connective tissue growth factor (CTGF) family. WNT1 is a member 
of a family of cysteine-rich, glycosylated signaling proteins that mediate diverse developmental 
processes. The CTGF family members are characterized by four conserved cysteine-rich domains: 
insulin-like growth factor-binding domain, von Willebrand factor type C module, thrombospondin 
domain and C-terminal cystine knot-like domain. This gene may be downstream in the WNT1 
signaling pathway that is relevant to malignant transformation. It is expressed at a high level in 
fibroblast cells, and overexpressed in colon tumors. The encoded protein binds to decorin and 
biglycan, two members of a family of small leucine-rich proteoglycans present in the extracellular 
matrix of connective tissue, and possibly prevents the inhibitory activity of decorin and biglycan in 
tumor cell proliferation. It also attenuates p53-mediated apoptosis in response to DNA damage 
through activation of the Akt kinase. It is 83% identical to the mouse protein at the amino acid level.
Alternative splicing of this gene generates 2 transcript variants.
[0488] XDH: xanthine dehydrogenase

[0489] Xanthine dehydrogenase belongs to the group of molybdenum-containing hydroxylases 
involved in the oxidative metabolism of purines. The enzyme is a homodimer. Xanthine 
dehydrogenase can be converted to xanthine oxidase by reversible sulfhydryl oxidation or by 
irreversible proteolytic modification. Defects in xanthine dehydrogenase cause xanthinuria, may 
contribute to adult respiratory stress syndrome, and may potentiate influenza infection through an 
oxygen metabolite-dependent mechanism.
[0490] YAP1: Yes-associated protein 1, 65 kD 

[0491] Yes-associated protein; binds to the proto-oncoprotein Yes; has a WW domain. 
[0492] PROCR: protein C receptor, endothelial (EPCR) 

[0493] Endothelial Protein C receptor; binds protein C in a calcium-dependent manner; member of 
the CD1/major histocompatibility complex superfamily.
[0494] STX1A: syntaxin 1A (brain) 

[0495] Syntaxin 1A (brain); involved in intracellular transport and neurotransmitter release.
[0496] As SNPs are linked to other SNPs in neighboring genes on a chromosome (linkage
disequilibrium), those SNPs could also be used as marker SNPs. A recent publication showed
that SNPs are linked over 100 kb, and in some cases over more than 150 kb (Reich D.E. et al., Nature 411, 199-
204, 2001). Hence SNPs lying in regions neighbouring PA SNPs could be linked to the latter and
thereby serve as diagnostic markers. These associations could be determined as described for the gene
polymorphisms in the Methods.
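
By way of non-limiting illustration, the degree of linkage between two SNPs could be quantified with the standard disequilibrium measures D, |D'| and r². The following is a minimal sketch only; the function name and the haplotype counts in the usage line are hypothetical and are not data from this file.

```python
def linkage_disequilibrium(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, |D'|, r_squared) for two biallelic SNPs.

    The arguments are counts of the four phased haplotypes, where A/a are
    the alleles of the first SNP and B/b the alleles of the second SNP.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("at least one haplotype is required")
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total  # frequency of allele A
    p_B = (n_AB + n_aB) / total  # frequency of allele B
    d = p_AB - p_A * p_B
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")
    # D is normalized by its largest attainable magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max
    r_squared = d * d / denom
    return d, d_prime, r_squared


# Hypothetical counts: the two SNPs are always inherited together.
print(linkage_disequilibrium(50, 0, 0, 50))
```

Values of |D'| and r² close to 1 would indicate that a neighbouring SNP carries nearly the same information as the associated SNP.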

[0497] Methods for Assessing Cardiovascular Status 

[0498] The present invention provides diagnostic methods for assessing cardiovascular status in a 
human individual. Cardiovascular status as used herein refers to the physiological status of an 
individual's cardiovascular system as reflected in one or more markers or indicators. Status markers 
include without limitation clinical measurements such as, e.g., blood pressure, electrocardiographic 
profile, and differentiated blood flow analysis as well as measurements of LDL- and HDL- 
Cholesterol levels, other lipids and other well established clinical parameters that are standard in the 
art. Status markers according to the invention include diagnoses of one or more cardiovascular 
syndromes, such as, e.g., hypertension, acute myocardial infarction, silent myocardial infarction, 
stroke, and atherosclerosis. It will be understood that a diagnosis of a cardiovascular syndrome made 
by a medical practitioner encompasses clinical measurements and medical judgement. Status markers 
according to the invention are assessed using conventional methods well known in the art. Also 
included in the evaluation of cardiovascular status are quantitative or qualitative changes in status 
markers with time, such as would be used, e.g., in the determination of an individual's response to a 
particular therapeutic regimen. 

[0499] The methods are carried out by the steps of: (i) determining the sequence of one or more 
polymorphic positions within one, several or all of the genes listed in the Examples or other genes
mentioned in this file in the individual to establish a polymorphic pattern for the individual; and (ii) 
comparing the polymorphic pattern established in (i) with the polymorphic patterns of humans 
exhibiting different markers of cardiovascular status. The polymorphic pattern of the individual is, 
preferably, highly similar and, most preferably, identical to the polymorphic pattern of individuals 
who exhibit particular status markers, cardiovascular syndromes, and/or particular patterns of 
response to therapeutic interventions. Polymorphic patterns may also include polymorphic positions 
in other genes which are shown, in combination with one or more polymorphic positions in the genes 
listed in the Examples, to correlate with the presence of particular status markers. In one 
embodiment, the method involves comparing an individual's polymorphic pattern with polymorphic 
patterns of individuals who have been shown to respond positively or negatively to a particular 
therapeutic regimen. Therapeutic regimen as used herein refers to treatments aimed at the elimination 
or amelioration of symptoms and events associated with cardiovascular disease. Such treatments include
without limitation one or more of alteration in diet, lifestyle, and exercise regimen; invasive and 
noninvasive surgical techniques such as atherectomy, angioplasty, and coronary bypass surgery; and 
pharmaceutical interventions, such as administration of ACE inhibitors, angiotensin II receptor 
antagonists, diuretics, alpha-adrenoreceptor antagonists, cardiac glycosides, phosphodiesterase 
inhibitors, beta-adrenoreceptor antagonists, calcium channel blockers, HMG-CoA reductase
inhibitors, imidazoline receptor blockers, endothelin receptor blockers, organic nitrites, and 
modulators of protein function of genes listed in the Examples. Interventions with pharmaceutical 
agents not yet known whose activity correlates with particular polymorphic patterns associated with 
cardiovascular disease are also encompassed. It is contemplated, for example, that patients who are 
candidates for a particular therapeutic regimen will be screened for polymorphic patterns that 
correlate with responsivity to that particular regimen. 
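
By way of non-limiting illustration, step (ii) of the method, comparing an individual's polymorphic pattern with reference patterns, could be sketched as follows. The similarity measure (fraction of shared positions with identical genotype), the function names, the SNP identifiers and the group labels are all assumptions made for this example; no particular comparison metric is prescribed here.

```python
def pattern_similarity(individual, reference):
    """Fraction of shared polymorphic positions with identical genotypes.

    Both arguments map a SNP identifier to a genotype string such as "AG".
    Genotypes are compared without regard to allele order.
    """
    shared = [snp for snp in individual if snp in reference]
    if not shared:
        return 0.0
    matches = sum(
        sorted(individual[snp]) == sorted(reference[snp]) for snp in shared
    )
    return matches / len(shared)


def best_matching_group(individual, group_patterns):
    """Return (group_name, similarity) for the most similar reference pattern."""
    scored = {
        name: pattern_similarity(individual, pattern)
        for name, pattern in group_patterns.items()
    }
    best = max(scored, key=scored.get)
    return best, scored[best]


# Hypothetical genotypes and reference patterns.
patient = {"snp_1": "AG", "snp_2": "CC", "snp_3": "TT"}
groups = {
    "responder": {"snp_1": "GA", "snp_2": "CC", "snp_3": "TT"},
    "non_responder": {"snp_1": "AA", "snp_2": "CT", "snp_3": "TT"},
}
print(best_matching_group(patient, groups))
```

A similarity of 1.0 corresponds to the "identical" case described above, and values near 1.0 to the "highly similar" case.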

[0500] In a preferred embodiment, the method involves comparing an individual's polymorphic 
pattern with polymorphic patterns of individuals who exhibit or have exhibited one or more markers 
of cardiovascular disease, such as, e.g., elevated LDL-Cholesterol levels, high blood pressure, 
abnormal electrocardiographic profile, myocardial infarction, stroke, or atherosclerosis. 
[0501] In another embodiment, the method involves comparing an individual's polymorphic
pattern with polymorphic patterns of individuals who exhibit or have exhibited one or more drug 
related phenotypes, such as, e.g., low or high drug response, or adverse drug reactions. 
[0502] In practicing the methods of the invention, an individual's polymorphic pattern can be 
established by obtaining DNA from the individual and determining the sequence at predetermined 
polymorphic positions in the genes such as those described in this file. 

[0503] The DNA may be obtained from any cell source. Non-limiting examples of cell sources 
available in clinical practice include blood cells, buccal cells, cervicovaginal cells, epithelial cells 
from urine, fetal cells, or any cells present in tissue obtained by biopsy. Cells may also be obtained 
from body fluids, including without limitation blood, saliva, sweat, urine, cerebrospinal fluid, feces, 
and tissue exudates at the site of infection or inflammation. DNA is extracted from the cell source or 
body fluid using any of the numerous methods that are standard in the art. It will be understood that 
the particular method used to extract DNA will depend on the nature of the source. 
[0504] Diagnostic and Prognostic Assays 

[0505] The present invention provides methods for determining the molecular structure of at least 
one polymorphic region of a gene, specific allelic variants of said polymorphic region being 
associated with cardiovascular disease. In one embodiment, determining the molecular structure of a 
polymorphic region of a gene comprises determining the identity of the allelic variant. A polymorphic
region of a gene, of which specific alleles are associated with cardiovascular disease, can be located in
an exon, an intron, at an intron/exon border, or in the promoter of the gene.
[0506] The invention provides methods for determining whether a subject has, or is at risk of
developing, a cardiovascular disease. Such disorders can be associated with an aberrant gene activity,
e.g., abnormal binding to a form of a lipid, or an aberrant gene protein level. An aberrant gene protein 
level can result from an aberrant transcription or post-transcriptional regulation. Thus, allelic 
differences in specific regions of a gene can result in differences in gene protein levels due to differences in
regulation of expression. In particular, some of the identified polymorphisms in the human gene may 
be associated with differences in the level of transcription, RNA maturation, splicing, or translation 
of the gene or transcription product. 

[0507] In preferred embodiments, the methods of the invention can be characterized as comprising 
detecting, in a sample of cells from the subject, the presence or absence of a specific allelic variant of 
one or more polymorphic regions of a gene. The allelic differences can be: (i) a difference in the 
identity of at least one nucleotide or (ii) a difference in the number of nucleotides, which difference 
can be a single nucleotide or several nucleotides. 

[0508] A preferred detection method is allele specific hybridization using probes overlapping the 
polymorphic site and having about 5, 10, 20, 25, or 30 nucleotides around the polymorphic region. 
Examples of probes for detecting specific allelic variants of the polymorphic region located in intron 
X are probes comprising a nucleotide sequence set forth in any of SEQ ID NO. X. In a preferred 
embodiment of the invention, several probes capable of hybridizing specifically to allelic variants are 
attached to a solid phase support, e.g., a "chip." Oligonucleotides can be bound to a solid support by a 
variety of processes, including lithography. For example a chip can hold up to 250,000 
oligonucleotides (GeneChip, Affymetrix). Mutation detection analysis using these chips comprising 
oligonucleotides, also termed "DNA probe arrays", is described, e.g., in Cronin et al., Human
Mutation 7:244 (1996) and in Kozal et al., Nature Medicine 2:753 (1996). In one embodiment, a
chip comprises all the allelic variants of at least one polymorphic region of a gene. The solid phase 
support is then contacted with a test nucleic acid and hybridization to the specific probes is detected. 
Accordingly, the identity of numerous allelic variants of one or more genes can be identified in a 
simple hybridization experiment. For example, the identity of the allelic variant of the nucleotide 
polymorphism of nucleotide A or G at position 33 of Seq ID 1 (baySNP179) and that of other 
possible polymorphic regions can be determined in a single hybridization experiment. 
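By way of non-limiting illustration, allele-specific probes that overlap a polymorphic site could be derived from a reference sequence as in the following minimal sketch. The function name, the flank length and the short sequence in the usage line are hypothetical; the sequence is made up for the example and is not SEQ ID 1.

```python
def allele_specific_probes(sequence, snp_position, alleles, flank=10):
    """Build one probe per allele, centred on the polymorphic site.

    sequence      -- reference sequence containing the polymorphic site
    snp_position  -- 1-based position of the polymorphic nucleotide
    alleles       -- iterable of alternative nucleotides, e.g. ("A", "G")
    flank         -- nucleotides kept on each side of the site
    """
    index = snp_position - 1
    if not 0 <= index < len(sequence):
        raise ValueError("SNP position lies outside the sequence")
    left = sequence[max(0, index - flank):index]
    right = sequence[index + 1:index + 1 + flank]
    return {allele: left + allele + right for allele in alleles}


# Made-up 10-nucleotide sequence with an A/G polymorphism at position 5.
print(allele_specific_probes("ACGTACGTAC", 5, ("A", "G"), flank=2))
```

Each returned probe differs from the others only at its central nucleotide, which is what allows hybridization to discriminate between the allelic variants.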
[0509] In other detection methods, it is necessary to first amplify at least a portion of a gene prior 
to identifying the allelic variant. Amplification can be performed, e.g., by PCR and/or LCR, 
according to methods known in the art. In one embodiment, genomic DNA of a cell is exposed to two
PCR primers and amplified for a number of cycles sufficient to produce the required amount of
amplified DNA. In preferred embodiments, the primers are located between 40 and 350 base pairs 
apart. Preferred primers for amplifying gene fragments of genes of this file are listed in Table 2 in 
the Examples. 
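
By way of non-limiting illustration, the preferred primer spacing of 40 to 350 base pairs could be checked in silico as sketched below. The sketch assumes that "apart" means the number of template bases lying between the two primer binding sites, and that primers bind by exact match; both are assumptions of this example, as are the function names and the made-up sequences.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def primer_spacing(template, forward, reverse):
    """Number of template bases between the forward and reverse primer sites.

    The template and both primers are written 5'->3'. The forward primer
    matches the template as written; the reverse primer matches its reverse
    complement. Returns None if either primer has no exact binding site
    downstream of the other.
    """
    start = template.find(forward)
    if start == -1:
        return None
    forward_end = start + len(forward)
    reverse_start = template.find(reverse_complement(reverse), forward_end)
    if reverse_start == -1:
        return None
    return reverse_start - forward_end


def spacing_is_preferred(spacing, minimum=40, maximum=350):
    """True if the spacing lies in the preferred 40-350 bp window."""
    return spacing is not None and minimum <= spacing <= maximum


# Made-up template: two 10-nt primer sites separated by 50 bases.
template = "ACGTACGTAC" + "G" * 50 + "TTCATTCATT"
spacing = primer_spacing(template, "ACGTACGTAC", "AATGAATGAA")
print(spacing, spacing_is_preferred(spacing))
```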

[0510] Alternative amplification methods include: self sustained sequence replication (Guatelli,
J.C. et al., Proc. Natl. Acad. Sci. USA 87:1874-1878 (1990)), transcriptional amplification system
(Kwoh, D.Y. et al., Proc. Natl. Acad. Sci. USA 86:1173-1177 (1989)), Q-Beta Replicase (Lizardi,
P.M. et al., Bio/Technology 6:1197 (1988)), or any other nucleic acid amplification method,
followed by the detection of the amplified molecules using techniques well known to those of skill in 
the art. These detection schemes are especially useful for the detection of nucleic acid molecules if 
such molecules are present in very low numbers. 

[0511] In one embodiment, any of a variety of sequencing reactions known in the art can be used to
directly sequence at least a portion of a gene and detect allelic variants, e.g., mutations, by comparing
the sample sequence with the corresponding wild-type (control) sequence. Exemplary
sequencing reactions include those based on techniques developed by Maxam and Gilbert (Proc.
Natl. Acad. Sci. USA 74:560 (1977)) or Sanger (Sanger et al., Proc. Natl. Acad. Sci. USA 74:5463
(1977)). It is also contemplated that any of a variety of automated sequencing procedures may be
utilized when performing the subject assays (Biotechniques 19:448 (1995)), including sequencing
by mass spectrometry (see, for example, U.S. Pat. No. 5,547,835 and international patent application
Publication Number WO 94/16101, entitled DNA Sequencing by Mass Spectrometry, by H. Koster;
U.S. Pat. No. 5,547,835 and international patent application Publication Number WO 94/21822,
entitled DNA Sequencing by Mass Spectrometry Via Exonuclease Degradation, by H. Koster;
U.S. Pat. No. 5,605,798 and International Patent Application No. PCT/US96/03651, entitled DNA
Diagnostics Based on Mass Spectrometry, by H. Koster; Cohen et al., Adv Chromatogr 36:127-162
(1996); and Griffin et al., Appl Biochem Biotechnol 38:147-159 (1993)). It will be evident to one
skilled in the art that, for certain embodiments, the occurrence of only one, two or three of the nucleic
acid bases need be determined in the sequencing reaction. For instance, A-track or the like, e.g.,
where only one nucleotide is detected, can be carried out.

[0512] Yet other sequencing methods are disclosed, e.g., in U.S. Pat. No. 5,580,732 entitled 
Method of DNA sequencing employing a mixed DNA-polymer chain probe and U.S. Pat. No. 
5,571,676 entitled Method for mismatch-directed in vitro DNA sequencing. 
[0513] In some cases, the presence of a specific allele of a gene in DNA from a subject can be 
shown by restriction enzyme analysis. For example, a specific nucleotide polymorphism can result in 
a nucleotide sequence comprising a restriction site which is absent from the nucleotide sequence of 
another allelic variant. 
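
By way of non-limiting illustration, whether a given allele creates a restriction site spanning the polymorphic position could be checked as in the following minimal sketch. The function name and the flanking sequences are hypothetical; GAATTC is the recognition sequence of EcoRI and is used here only as an example.

```python
def site_overlaps_snp(left_flank, allele, right_flank, site):
    """True if the recognition site occurs so that it covers the SNP position.

    left_flank / right_flank -- invariant sequence on either side of the SNP
    allele                   -- nucleotide present at the polymorphic position
    site                     -- recognition sequence of the restriction enzyme
    """
    seq = (left_flank + allele + right_flank).upper()
    site = site.upper()
    snp_index = len(left_flank)
    # Only windows that include the polymorphic nucleotide are examined.
    first = max(0, snp_index - len(site) + 1)
    last = min(snp_index, len(seq) - len(site))
    return any(seq[i:i + len(site)] == site for i in range(first, last + 1))


# Made-up flanks: the A allele completes GAATTC, the G allele does not.
for allele in ("A", "G"):
    print(allele, site_overlaps_snp("TTGA", allele, "TTCGG", "GAATTC"))
```

In this made-up case, digestion would cut only DNA carrying the A allele, so the two allelic variants would yield different fragment patterns.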

[0514] In other embodiments, alterations in electrophoretic mobility are used to identify the type of
gene allelic variant. For example, single strand conformation polymorphism (SSCP) may be used to
detect differences in electrophoretic mobility between mutant and wild type nucleic acids (Orita et
al., Proc. Natl. Acad. Sci. USA 86:2766 (1989); see also Cotton, Mutat Res 285:125-144 (1993);
and Hayashi, Genet Anal Tech Appl 9:73-79 (1992)). Single-stranded DNA fragments of sample
and control nucleic acids are denatured and allowed to renature. The secondary structure of single-
stranded nucleic acids varies according to sequence; the resulting alteration in electrophoretic
mobility enables the detection of even a single base change. The DNA fragments may be labeled or
detected with labeled probes. The sensitivity of the assay may be enhanced by using RNA (rather
than DNA), in which the secondary structure is more sensitive to a change in sequence. In another
preferred embodiment, the subject method utilizes heteroduplex analysis to separate double stranded
heteroduplex molecules on the basis of changes in electrophoretic mobility (Keen et al., Trends
Genet 7:5 (1991)).

[0515] In yet another embodiment, the identity of an allelic variant of a polymorphic region is
obtained by analyzing the movement of a nucleic acid comprising the polymorphic region in
polyacrylamide gels containing a gradient of denaturant, using denaturing gradient gel
electrophoresis (DGGE) (Myers et al., Nature 313:495 (1985)). When DGGE is used as the method
of analysis, DNA will be modified to ensure that it does not completely denature, for example by
adding a GC clamp of approximately 40 bp of high-melting GC-rich DNA by PCR. In a further
embodiment, a temperature gradient is used in place of a denaturing agent gradient to identify
differences in the mobility of control and sample DNA (Rosenbaum and Reissner, Biophys Chem
265:1275 (1987)).

[0516] Examples of techniques for detecting differences of at least one nucleotide between 2 
nucleic acids include, but are not limited to, selective oligonucleotide hybridization, selective 
amplification, or selective primer extension. For example, oligonucleotide probes may be prepared in 
which the known polymorphic nucleotide is placed centrally (allele-specific probes) and then 
hybridized to target DNA under conditions which permit hybridization only if a perfect match is 
found (Saiki et al., Nature 324:163 (1986); Saiki et al., Proc. Natl. Acad. Sci. USA 86:6230
(1989); and Wallace et al., Nucl. Acids Res. 6:3543 (1979)). Such allele specific oligonucleotide
hybridization techniques may be used for the simultaneous detection of several nucleotide changes in
different polymorphic regions of a gene. For example, oligonucleotides having nucleotide sequences of
specific allelic variants are attached to a hybridizing membrane and this membrane is then hybridized 
with labeled sample nucleic acid. Analysis of the hybridization signal will then reveal the identity of 
the nucleotides of the sample nucleic acid. 

[0517] Alternatively, allele specific amplification technology which depends on selective PCR 
amplification may be used. Oligonucleotides used as primers for specific amplification may carry the 
allelic variant of interest in the center of the molecule (so that amplification depends on differential 
hybridization) (Gibbs et al., Nucleic Acids Res. 17:2437-2448 (1989)) or at the extreme 3' end of
one primer where, under appropriate conditions, mismatch can prevent or reduce polymerase
extension (Prossner, Tibtech 11:238 (1993); Newton et al., Nucl. Acids Res. 17:2503 (1989)). This
technique is also termed "PROBE" for Probe Oligo Base Extension. In addition it may be desirable to
introduce a novel restriction site in the region of the mutation to create cleavage-based detection
(Gasparini et al., Mol. Cell Probes 6:1 (1992)).

[0518] In another embodiment, identification of the allelic variant is carried out using an 
oligonucleotide ligation assay (OLA), as described, e.g., in U.S. Pat. No. 4,998,617 and in Landegren, 
U. et al., Science 241:1077-1080 (1988). The OLA protocol uses two oligonucleotides which are 
designed to be capable of hybridizing to abutting sequences of a single strand of a target. One of the 
oligonucleotides is linked to a separation marker, e.g., biotinylated, and the other is detectably
labeled. If the precise complementary sequence is found in a target molecule, the oligonucleotides
will hybridize such that their termini abut, and create a ligation substrate. Ligation then permits the
labeled oligonucleotide to be recovered using avidin, or another biotin ligand. Nickerson, D. A. et al.
have described a nucleic acid detection assay that combines attributes of PCR and OLA (Nickerson,
D.A. et al., Proc. Natl. Acad. Sci. USA 87:8923-8927 (1990)). In this method, PCR is used to
achieve the exponential amplification of target DNA, which is then detected using OLA. 
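
By way of non-limiting illustration, the logic of the OLA read-out could be modeled as in the following minimal sketch, in which ligation is assumed to occur only when an allele-specific probe and a common probe both match perfectly and abut. The function name, the simplification to exact string matching, and the made-up sequences are assumptions of this example and do not describe the chemistry of the assay.

```python
def ola_genotype(template, allele_probes, common_probe):
    """Return the alleles whose probe would be ligated to the common probe.

    template      -- sequence written in the same sense as the probes (that
                     is, the complement of the target strand they anneal to)
    allele_probes -- maps each allele to a probe whose 3' terminal base is
                     the polymorphic nucleotide
    common_probe  -- labeled probe that anneals directly downstream
    """
    detected = []
    for allele, probe in allele_probes.items():
        # Ligation is modeled as requiring a perfect, contiguous match.
        if probe + common_probe in template:
            detected.append(allele)
    return detected


# Made-up sequences: only the A-specific probe abuts the common probe.
template = "TTGACCTAGGCATTCGA"
probes = {"A": "GACCTA", "G": "GACCTG"}
print(ola_genotype(template, probes, "GGCATT"))
```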
[0519] Several techniques based on this OLA method have been developed and can be used to 
detect specific allelic variants of a polymorphic region of a gene. For example, U.S. Pat. No. 
5,593,826 discloses an OLA using an oligonucleotide having a 3'-amino group and a 5'-phosphorylated
oligonucleotide to form a conjugate having a phosphoramidate linkage. In another variation of OLA
described in Tobe et al., Nucleic Acids Res 24:3728 (1996), OLA combined with PCR permits
typing of two alleles in a single microtiter well. By marking each of the allele-specific primers with a
unique hapten, i.e., digoxigenin and fluorescein, each OLA reaction can be detected by using hapten
specific antibodies that are labeled with different enzyme reporters, alkaline phosphatase or
horseradish peroxidase. This system permits the detection of the two alleles using a high throughput 
format that leads to the production of two different colors. 

[0520] The invention further provides methods for detecting single nucleotide polymorphisms in a 
gene. Because single nucleotide polymorphisms constitute sites of variation flanked by regions of 
invariant sequence, their analysis requires no more than the determination of the identity of the single 
nucleotide present at the site of variation and it is unnecessary to determine a complete gene sequence 
for each patient. Several methods have been developed to facilitate the analysis of such single 
nucleotide polymorphisms. 

[0521] In one embodiment, the single base polymorphism can be detected by using a specialized 
exonuclease-resistant nucleotide, as disclosed, e.g., in Mundy, C. R. (U.S. Pat. No. 4,656,127). 
According to the method, a primer complementary to the allelic sequence immediately 3' to the 
polymorphic site is permitted to hybridize to a target molecule obtained from a particular animal or 
human. If the polymorphic site on the target molecule contains a nucleotide that is complementary to 
the particular exonuclease-resistant nucleotide derivative present, then that derivative will be 
incorporated onto the end of the hybridized primer. Such incorporation renders the primer resistant to 
exonuclease, and thereby permits its detection. Since the identity of the exonuclease-resistant 
derivative of the sample is known, a finding that the primer has become resistant to exonucleases 
reveals that the nucleotide present in the polymorphic site of the target molecule was complementary 
to that of the nucleotide derivative used in the reaction. This method has the advantage that it does 
not require the determination of large amounts of extraneous sequence data. 
[0522] In another embodiment of the invention, a solution-based method is used for determining 
the identity of the nucleotide of a polymorphic site (Cohen, D. et al., French Patent 2,650,840; PCT 
Appln. No. WO91/02087). As in the Mundy method of U.S. Pat. No. 4,656,127, a primer is employed 
that is complementary to allelic sequences immediately 3' to a polymorphic site. The method 
determines the identity of the nucleotide of that site using labeled dideoxynucleotide derivatives, 
which, if complementary to the nucleotide of the polymorphic site will become incorporated onto the 
terminus of the primer. 
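The single-base extension logic can be sketched as follows. This is a minimal illustration under stated assumptions: the template strand is written 5' to 3', `site` is a 0-based index of the polymorphic position, and the function names and default primer length are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def design_primer(template, site, length=20):
    """Primer that anneals to the template bases immediately 3' of `site`.

    The primer's 3' end pairs with template[site + 1], so the next base
    added by the polymerase pairs with the polymorphic position.
    """
    region = template[site + 1 : site + 1 + length]
    if len(region) < length:
        raise ValueError("not enough template sequence 3' of the site")
    return reverse_complement(region)


def incorporated_terminator(template, site):
    """Name the dideoxynucleotide complementary to the polymorphic base."""
    return "dd" + COMPLEMENT[template[site]] + "TP"
```

For a template carrying A at the polymorphic position, `incorporated_terminator` returns `"ddTTP"`, so detecting the label attached to ddTTP identifies the allele.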

[0523] An alternative method, known as Genetic Bit Analysis or GBA™, is described by Goelet, 
P. et al. (PCT Appln. No. 92/15712). The method of Goelet, P. et al. uses mixtures of labeled 
terminators and a primer that is complementary to the sequence 3' to a polymorphic site. The labeled 
terminator that is incorporated is thus determined by, and complementary to, the nucleotide present in 
the polymorphic site of the target molecule being evaluated. In contrast to the method of Cohen et al. 
(French Patent 2,650,840; PCT Appln. No. WO91/02087), the method of Goelet, P. et al. is preferably 
a heterogeneous phase assay, in which the primer or the target molecule is immobilized to a solid 
phase. 

[0524] Recently, several primer-guided nucleotide incorporation procedures for assaying 
polymorphic sites in DNA have been described (Kornher, J.S. et al., NUCL. ACIDS RES. 17:7779-7784 
(1989); Sokolov, B.P., NUCL. ACIDS RES. 18:3671 (1990); Syvanen, A.C. et al., GENOMICS 
8:684-692 (1990); Kuppuswamy, M.N. et al., PROC. NATL. ACAD. SCI. USA 88:1143-1147 (1991); 
Prezant, T.R. et al., HUM. MUTAT. 1:159-164 (1992); Ugozzoli, L. et al., GATA 9:107-112 (1992); 
Nyren, P. et al., ANAL. BIOCHEM. 208:171-175 (1993)). These methods differ from GBA™ in that 
they all rely on the incorporation of labeled deoxynucleotides to discriminate between bases at a 
polymorphic site. In such a format, since the signal is proportional to the number of deoxynucleotides 
incorporated, polymorphisms that occur in runs of the same nucleotide can result in signals that are 
proportional to the length of the run (Syvanen, A.C. et al., AMER. J. HUM. GENET. 52:46-59 (1993)). 
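The run-length effect can be illustrated with a short sketch. It assumes an idealized reaction in which only the labeled deoxynucleotide complementary to the polymorphic base is supplied, so extension continues until the template base changes; the function name and the 5'-to-3' template convention are assumptions for the example.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def labeled_bases_incorporated(template, site):
    """Count labeled dNTPs added when only one labeled dNTP is supplied.

    `template` is written 5'->3' and the primer anneals 3' of `site`,
    so extension reads the template toward its 5' end (decreasing
    index) and stops when the template base changes.
    Returns (incorporated base, number of bases incorporated).
    """
    base = template[site]
    count = 0
    index = site
    while index >= 0 and template[index] == base:
        count += 1
        index -= 1
    return COMPLEMENT[base], count
```

With a template such as `"CGTTTACGGA"` and the polymorphic site at index 4, three labeled A residues are incorporated, so the signal is three times that of an isolated T at the same site.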
[0525] For determining the identity of the allelic variant of a polymorphic region located in the 
coding region of a gene, methods other than those described above can also be used. For example, 
identification of an allelic variant which encodes a mutated gene protein can be performed by using 
an antibody specifically recognizing the mutant protein in, e.g., immunohistochemistry or 
immunoprecipitation. Antibodies to wild-type gene protein are described, e.g., in Acton et al., 
SCIENCE 271:518 (1996) (anti-mouse gene antibody cross-reactive with human gene). Other 
antibodies to wild-type gene or mutated forms of gene proteins can be prepared according to methods 
known in the art. Alternatively, one can also measure an activity of a gene protein, such as binding 
to a lipid or lipoprotein. Binding assays are known in the art and involve, e.g., obtaining cells from a 
subject, and performing binding experiments with a labeled lipid, to determine whether binding to the 
mutated form of the receptor differs from binding to the wild-type of the receptor. 
[0526] If a polymorphic region is located in an exon, either in a coding or non-coding region of the 
gene, the identity of the allelic variant can be determined by determining the molecular structure of 
the mRNA, pre-mRNA, or cDNA. The molecular structure can be determined using any of the above 
described methods for determining the molecular structure of the genomic DNA, e.g., sequencing and 
SSCP. 

[0527] The methods described herein may be performed, for example, by utilizing pre-packaged 
diagnostic kits, such as those described above, comprising at least one probe or primer nucleic acid 
described herein, which may be conveniently used, e.g., to determine whether a subject has or is at 
risk of developing a disease associated with a specific gene allelic variant. 

[0528] Sample nucleic acid for use in the above-described diagnostic and prognostic methods 
can be obtained from any cell type or tissue of a subject. For example, a subject's bodily fluid (e.g., 
blood) can be obtained by known techniques (e.g., venipuncture), or nucleic acid can be obtained from 
human tissues such as heart (biopsies, transplanted organs). Alternatively, nucleic acid tests can be 
performed on dry samples (e.g., hair or skin). Fetal nucleic acid samples for prenatal diagnostics can 
be obtained from maternal blood as described in International Patent Application No. WO91/07660 to 
Bianchi. Alternatively, 
amniocytes or chorionic villi may be obtained for performing prenatal testing. 
[0529] Diagnostic procedures may also be performed in situ directly upon tissue sections (fixed 
and/or frozen) of patient tissue obtained from biopsies or resections, such that no nucleic acid 
purification is necessary. Nucleic acid reagents may be used as probes and/or primers for such in situ 
procedures (see, e.g., Nuovo, G. J., PCR IN SITU HYBRIDIZATION: PROTOCOLS AND APPLICATIONS 
(Raven Press, New York, 1992)). 

[0530] In addition to methods which focus primarily on the detection of one nucleic acid sequence, 
profiles may also be assessed in such detection schemes. Fingerprint profiles may be generated, for 
example, by utilizing a differential display procedure, Northern analysis and/or RT-PCR. 
[0531] In practicing the present invention, the distribution of polymorphic patterns in a large 
number of individuals exhibiting particular markers of cardiovascular status or drug response is 
determined by any of the methods described above, and compared with the distribution of 
polymorphic patterns in patients that have been matched for age, ethnic origin, and/or any other 
statistically or medically relevant parameters, who exhibit quantitatively or qualitatively different 
status markers. Correlations are achieved using any method known in the art, including nominal 
logistic regression, chi square tests or standard least squares regression analysis. In this manner, it is 
possible to establish statistically significant correlations between particular polymorphic patterns and 
particular cardiovascular statuses (given in p values). It is further possible to establish statistically 
significant correlations between particular polymorphic patterns and changes in cardiovascular status 
or drug response such as would result, e.g., from particular treatment regimens. In this manner, it is 
possible to correlate polymorphic patterns with responsivity to particular treatments. 
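As one example of such a correlation, a chi-square test on genotype counts can be sketched in plain Python. This is a minimal sketch assuming a biallelic SNP with three genotype classes compared between two groups; the function name is hypothetical, and the closed-form p value applies only to this 2 x 3 case, which has two degrees of freedom.

```python
import math


def chi_square_2x3(cases, controls):
    """Pearson chi-square test for a 2 x 3 genotype table.

    cases / controls: counts of the three genotypes (e.g. AA, AB, BB)
    in each group. Returns (chi2, p). With 2 degrees of freedom the
    chi-square survival function is exactly exp(-chi2 / 2).
    """
    if len(cases) != 3 or len(controls) != 3:
        raise ValueError("expected three genotype counts per group")
    rows = [cases, controls]
    row_totals = [sum(row) for row in rows]
    col_totals = [cases[j] + controls[j] for j in range(3)]
    if min(col_totals) == 0 or min(row_totals) == 0:
        raise ValueError("every genotype and group needs a nonzero total")
    total = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(3):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (rows[i][j] - expected) ** 2 / expected
    return chi2, math.exp(-chi2 / 2)
```

For illustrative counts of `[30, 50, 20]` in one group and `[50, 40, 10]` in the other, the statistic is about 9.44 and the p value about 0.009, which would be reported as a significant association at the 0.05 level.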
[0532] In another embodiment of the present invention two or more polymorphic regions are 
combined to define so called 'haplotypes.' Haplotypes are groups of two or more SNPs that are 
functionally and/or spatially linked. It is possible to combine SNPs that are disclosed in the present 
invention either with each other or with additional polymorphic regions to form a haplotype. 
Haplotypes are expected to give better predictive/diagnostic information than a single SNP. 
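Combining SNPs into haplotypes can be sketched as below. The example assumes that phase is already known, i.e. each record describes the alleles carried on one chromosome; the SNP identifiers and the function name are hypothetical.

```python
from collections import Counter


def haplotype_frequencies(chromosomes, snp_ids):
    """Return the relative frequency of each observed haplotype.

    chromosomes: list of dicts mapping SNP id -> allele for one phased
    chromosome. snp_ids: ordered SNP ids that define the haplotype.
    """
    counts = Counter(
        "-".join(chromosome[snp] for snp in snp_ids)
        for chromosome in chromosomes
    )
    total = sum(counts.values())
    return {haplotype: n / total for haplotype, n in counts.items()}
```

Each haplotype is written as the alleles joined in SNP order, for example `"A-G"` for allele A at the first SNP and G at the second.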
[0533] In a preferred embodiment of the present invention a panel of SNPs/haplotypes is defined 
that predicts the risk for CVD or drug response. This predictive panel is then used for genotyping of 
patients on a platform that can genotype multiple SNPs at the same time (multiplexing). Preferred 
platforms are, e.g., gene chips (Affymetrix) or the Luminex LabMAP reader. The subsequent 
identification and evaluation of a patient's haplotype can then help to guide specific and 
individualized therapy. 

[0534] For example, the present invention can identify patients exhibiting genetic polymorphisms 
or haplotypes which indicate an increased risk for adverse drug reactions. In that case the drug dose 
should be lowered so that the risk of ADR is diminished. Likewise, if the patient's response to drug 
administration is particularly high (or the patient is a poor metabolizer of the drug), the drug dose 
should be lowered to avoid the risk of ADR. 

[0535] Conversely, if the patient's response to drug administration is low (or the patient is a particularly 
high metabolizer of the drug), and there is no evident risk of ADR, the drug dose should be raised to 
an efficacious level. 
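The dose-adjustment rules of the two preceding paragraphs can be written as a small decision function. This is only an illustration of the rules as stated; the category labels and the function name are assumptions, and how a patient is assigned to a category is outside the sketch.

```python
def dose_adjustment(adr_risk, response, metabolizer):
    """Map a patient class to a dose direction.

    adr_risk: True if the genotype indicates increased ADR risk.
    response: "low", "normal" or "high" (hypothetical labels).
    metabolizer: "poor", "normal" or "high" (hypothetical labels).
    """
    # ADR risk, a high response, or poor metabolism all call for a lower dose.
    if adr_risk or response == "high" or metabolizer == "poor":
        return "lower"
    # A low response or fast metabolism without ADR risk calls for a higher dose.
    if response == "low" or metabolizer == "high":
        return "raise"
    return "unchanged"
```

The ADR check comes first so that a low responder with an evident ADR risk is never assigned a raised dose.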

[0536] It is self-evident that the ability to predict a patient's individual drug response should affect 
the formulation of a drug, i.e., drug formulations should be tailored so that they suit the 
different patient classes (low/high responders, poor/good metabolizers, ADR-prone patients). Those 
different drug formulations may encompass different doses of the drug, i.e., the medicinal products 
contain low or high amounts of the active substance. In another embodiment of the invention the 
drug formulation may contain additional substances that facilitate the beneficial effects and/or 
diminish the risk for ADR (Folkers et al. 1991, US Pat. 5,316,765). 
[0537] Isolated Polymorphic Nucleic Acids, Probes, and Vectors 

[0538] The present invention provides isolated nucleic acids comprising the polymorphic positions 
described herein for human genes; vectors comprising the nucleic acids; and transformed host cells 
comprising the vectors. The invention also provides probes which are useful for detecting these 
polymorphisms. 

[0539] In practicing the present invention, many conventional techniques in molecular biology, 
microbiology, and recombinant DNA, are used. Such techniques are well known and are explained 
fully in, for example, Sambrook et al., MOLECULAR CLONING: A LABORATORY MANUAL, 2nd ed. 
(Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York 1989); DNA CLONING: A 
PRACTICAL APPROACH vols. I and II (D.N. Glover ed. 1985); OLIGONUCLEOTIDE SYNTHESIS 
(M.J. Gait ed. 1984); NUCLEIC ACID HYBRIDIZATION (Hames and Higgins 1985); Ausubel et al., 
CURRENT PROTOCOLS IN MOLECULAR BIOLOGY (John Wiley and Sons 1997); and METHODS IN 
ENZYMOLOGY vols. 154 and 155 (Wu and Grossman, and Wu, eds., respectively). 
[0540] Insertion of nucleic acids (typically DNAs) comprising the sequences of the present invention 
in a functional context, such as a full-length cDNA, into a vector is easily accomplished when 
the termini of both the DNAs and the vector comprise compatible restriction sites. If this cannot be 
done, it may be necessary to modify the termini of the DNAs and/or vector by digesting back single- 
stranded DNA overhangs generated by restriction endonuclease cleavage to produce blunt ends, or to 
achieve the same result by filling in the single-stranded termini with an appropriate DNA polymerase. 
[0541] Alternatively, any site desired may be produced, e.g., by ligating nucleotide sequences 
(linkers) onto the termini. Such linkers may comprise specific oligonucleotide sequences that define 
desired restriction sites. Restriction sites can also be generated by the use of the polymerase chain 
reaction (PCR). See, e.g., Saiki et al., SCIENCE 239:48 (1988). The cleaved vector and the DNA 
fragments may also be modified if required by homopolymeric tailing. 

[0542] The nucleic acids may be isolated directly from cells or may be chemically synthesized 
using known methods. Alternatively, the polymerase chain reaction (PCR) method can be used to 
produce the nucleic acids of the invention, using either chemically synthesized strands or genomic 
material as templates. Primers used for PCR can be synthesized using the sequence information 
provided herein and can further be designed to introduce appropriate new restriction sites, if 
desirable, to facilitate incorporation into a given vector for recombinant expression. 
[0543] The nucleic acids of the present invention may be flanked by native gene sequences, or may 
be associated with heterologous sequences, including promoters, enhancers, response elements, 
signal sequences, polyadenylation sequences, introns, 5'- and 3'-noncoding regions, and the like. The 
nucleic acids may also be modified by many means known in the art. Non-limiting examples of such 
modifications include methylation, "caps," substitution of one or more of the naturally occurring 
nucleotides with an analog, internucleotide modifications such as, for example, those with uncharged 
linkages (e.g., methyl phosphonates, phosphotriesters, phosphoramidates, carbamates, morpholines, 
etc.) and with charged linkages (e.g., phosphorothioates, phosphorodithioates, etc.). Nucleic acids 
may contain one or more additional covalently linked moieties, such as, for example, proteins (e.g., 
nucleases, toxins, antibodies, signal peptides, poly-L-lysine, etc.), intercalators (e.g., acridine, 
psoralen, etc.), chelators (e.g., metals, radioactive metals, iron, oxidative metals, etc.), and alkylators. 
PNAs are also included. The nucleic acid may be derivatized by formation of a methyl or ethyl 
phosphotriester or an alkyl phosphoramidate linkage. Furthermore, the nucleic acid sequences of the 
present invention may also be modified with a label capable of providing a detectable signal, either 
directly or indirectly. Exemplary labels include radioisotopes, fluorescent molecules, biotin, and the 
like. 

[0544] The invention also provides nucleic acid vectors comprising the gene sequences or 
derivatives or fragments thereof of genes described in the Examples. A large number of vectors, 
including plasmid and fungal vectors, have been described for replication and/or expression in a 
variety of eukaryotic and prokaryotic hosts, and may be used for gene therapy as well as for simple 
cloning or protein expression. Non-limiting examples of suitable vectors include 
pUC plasmids, pET plasmids (Novagen, Inc., Madison, Wis.), or pRSET or pREP (Invitrogen, San 
Diego, Calif.), and many appropriate host cells, using methods disclosed or cited herein or otherwise 
known to those skilled in the relevant art. The particular choice of vector/host is not critical to the 
practice of the invention. 

[0545] Suitable host cells may be transformed/transfected/infected as appropriate by any suitable 
method including electroporation, CaCl2-mediated DNA uptake, fungal or viral infection, 
microinjection, microprojectile, or other established methods. Appropriate host cells include 
bacteria, archaebacteria, fungi, especially yeast, and plant and animal cells, especially mammalian 
cells. A large number of transcription initiation and termination regulatory regions have been isolated 
and shown to be effective in the transcription and translation of heterologous proteins in the various 
hosts. Examples of these regions, methods of isolation, manner of manipulation, etc. are known in the 
art. Under appropriate expression conditions, host cells can be used as a source of recombinantly 
produced peptides and polypeptides encoded by genes of the Examples. Nucleic acids encoding 
peptides or polypeptides from gene sequences of the Examples may also be introduced into cells by 
recombination events. For example, such a sequence can be introduced into a cell and thereby effect 
homologous recombination at the site of an endogenous gene or a sequence with substantial identity 
to the gene. Other recombination-based methods such as non-homologous recombinations or deletion 
of endogenous genes by homologous recombination may also be used. 
[0546] In the case of proteins that form heterodimers or other multimers, both or all subunits have to 
be expressed in one system or cell. 

[0547] The nucleic acids of the present invention find use as probes for the detection of genetic 
polymorphisms and as templates for the recombinant production of normal or variant peptides or 
polypeptides encoded by genes listed in the Examples. 

[0548] Probes in accordance with the present invention comprise without limitation isolated 
nucleic acids of about 10-100 bp, preferably 15-75 bp and most preferably 17-25 bp in length, which 
hybridize at high stringency to one or more of the polymorphic sequences disclosed herein or to a 
sequence immediately adjacent to a polymorphic position. Furthermore, in some embodiments a full- 
length gene sequence may be used as a probe. In one series of embodiments, the probes span the 
polymorphic positions in genes disclosed herein. In another series of embodiments, the probes 
correspond to sequences immediately adjacent to the polymorphic positions. 
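The two probe layouts described above, spanning the polymorphic position or lying immediately adjacent to it, can be sketched as simple sequence windows. This is a minimal sketch assuming a 0-based index for the polymorphic position; the function names and default length are hypothetical, and the windows are returned in the orientation of the input strand (a hybridization probe would be the complement of its target strand).

```python
def spanning_probe(sequence, snp_index, length=21):
    """Return a window of `length` bases centred on the polymorphic position."""
    if not 10 <= length <= 100:
        raise ValueError("length outside the 10-100 base range")
    start = snp_index - length // 2
    end = start + length
    if start < 0 or end > len(sequence):
        raise ValueError("not enough flanking sequence")
    return sequence[start:end]


def adjacent_probe(sequence, snp_index, length=21):
    """Return the `length` bases ending immediately before the polymorphic position."""
    if not 10 <= length <= 100:
        raise ValueError("length outside the 10-100 base range")
    start = snp_index - length
    if start < 0:
        raise ValueError("not enough flanking sequence")
    return sequence[start:snp_index]
```

The default of 21 bases falls within the most preferred 17-25 range and, being odd, places the polymorphic base exactly in the middle of the spanning probe.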
[0549] Polymorphic Polypeptides and Polymorphism-Specific Antibodies 
[0550] The present invention encompasses isolated peptides and polypeptides encoded by genes 
listed in the Examples comprising polymorphic positions disclosed herein. In one preferred 
embodiment, the peptides and polypeptides are useful screening targets to identify cardiovascular 
drugs. In another preferred embodiment, the peptides and polypeptides are capable of eliciting 
antibodies in a suitable host animal that react specifically with a polypeptide comprising the 
polymorphic position and distinguish it from other polypeptides having a different sequence at that 
position. 

[0551] Polypeptides according to the invention are preferably at least five residues in 
length, more preferably at least fifteen residues. Methods for obtaining these polypeptides are described 
below. Many conventional techniques in protein biochemistry and immunology are used. Such 
techniques are well known and are explained in IMMUNOCHEMICAL METHODS IN CELL AND 
MOLECULAR BIOLOGY (Mayer and Walker eds., Academic Press, London 1987); Scopes, PROTEIN 
PURIFICATION: PRINCIPLES AND PRACTICE, Second Edition (Springer-Verlag, N.Y. 1987); and 
HANDBOOK OF EXPERIMENTAL IMMUNOLOGY, vols. I to IV (Weir and Blackwell eds., 1986). 
[0552] Nucleic acids comprising protein-coding sequences can be used to direct the 
recombinant expression of polypeptides encoded by genes disclosed herein in intact cells or in cell- 
free translation systems. The known genetic code, tailored if desired for more efficient expression in 
a given host organism, can be used to synthesize oligonucleotides encoding the desired amino acid 
sequences. The polypeptides may be isolated from human cells, or from heterologous organisms or 
cells (including, but not limited to, bacteria, fungi, insect, plant, and mammalian cells) into which an 
appropriate protein-coding sequence has been introduced and expressed. Furthermore, the 
polypeptides may be part of recombinant fusion proteins. 
[0553] Peptides and polypeptides may be chemically synthesized by commercially available 
automated procedures, including, without limitation, exclusive solid phase synthesis, partial solid 
phase methods, fragment condensation or classical solution synthesis. The polypeptides are 
preferably prepared by solid phase peptide synthesis as described by Merrifield, J. AM. CHEM. SOC. 
85:2149 (1963). 

[0554] Methods for polypeptide purification are well-known in the art, including, without 
limitation, preparative disc-gel electrophoresis, isoelectric focusing, HPLC, reversed-phase HPLC, 
gel filtration, ion exchange and partition chromatography, and countercurrent distribution. For some 
purposes, it is preferable to produce the polypeptide in a recombinant system in which the protein 
contains an additional sequence tag that facilitates purification, such as, but not limited to, a 
polyhistidine sequence. The polypeptide can then be purified from a crude lysate of the host cell by 
chromatography on an appropriate solid-phase matrix. Alternatively, antibodies produced against 
peptides encoded by genes disclosed herein, can be used as purification reagents. Other purification 
methods are possible. 

[0555] The present invention also encompasses derivatives and homologues of the polypeptides. 
For some purposes, nucleic acid sequences encoding the peptides may be altered by substitutions, 
additions, or deletions that provide for functionally equivalent molecules, i.e., function-conservative 
variants. For example, one or more amino acid residues within the sequence can be substituted by 
another amino acid of similar properties, such as, for example, positively charged amino acids 
(arginine, lysine, and histidine); negatively charged amino acids (aspartate and glutamate); polar 
neutral amino acids; and non-polar amino acids. 
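A function-conservative substitution check along these lines might look as follows. The charged classes follow the text; the membership of the polar neutral and non-polar classes is not enumerated in the source and is an assumption here (one common textbook grouping), as are the names used.

```python
# Residue classes keyed by one-letter amino acid codes.
# The two uncharged classes are an assumed grouping, not taken from the text.
RESIDUE_CLASSES = {
    "positive": set("RKH"),        # arginine, lysine, histidine
    "negative": set("DE"),         # aspartate, glutamate
    "polar_neutral": set("STNQCY"),
    "nonpolar": set("AVLIMFWPG"),
}


def residue_class(amino_acid):
    """Return the class name for a one-letter amino acid code."""
    for name, members in RESIDUE_CLASSES.items():
        if amino_acid in members:
            return name
    raise ValueError(f"unknown amino acid code: {amino_acid!r}")


def is_conservative(original, substitute):
    """True if both residues fall in the same property class."""
    return residue_class(original) == residue_class(substitute)
```

Under this grouping, replacing lysine with arginine counts as conservative, whereas replacing lysine with glutamate does not.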

[0556] The isolated polypeptides may be modified by, for example, phosphorylation, sulfation, 
acylation, or other protein modifications. They may also be modified with a label capable of 
providing a detectable signal, either directly or indirectly, including, but not limited to, radioisotopes 
and fluorescent compounds. 

[0557] The present invention also encompasses antibodies that specifically recognize the 
polymorphic positions of the invention and distinguish a peptide or polypeptide containing a 
particular polymorphism from one that contains a different sequence at that position. Such 
polymorphic position-specific antibodies according to the present invention include polyclonal and 
monoclonal antibodies. The antibodies may be elicited in an animal host by immunization with 
peptides encoded by genes disclosed herein or may be formed by in vitro immunization of immune 
cells. The immunogenic components used to elicit the antibodies may be isolated from human cells or 
produced in recombinant systems. The antibodies may also be produced in recombinant systems 
programmed with appropriate antibody-encoding DNA. Alternatively, the antibodies may be 
constructed by biochemical reconstitution of purified heavy and light chains. The antibodies include 
hybrid antibodies (i.e., containing two sets of heavy chain/light chain combinations, each of which 
recognizes a different antigen), chimeric antibodies (i.e., in which either the heavy chains, light 
chains, or both, are fusion proteins), and univalent antibodies (i.e., comprised of a heavy chain/light 
chain complex bound to the constant region of a second heavy chain). Also included are Fab 
fragments, including Fab' and F(ab')2 fragments of antibodies. Methods for the production of all 
of the above types of antibodies and derivatives are well-known in the art and are discussed in more 
detail below. For example, techniques for producing and processing polyclonal antisera are disclosed 
in Mayer and Walker, IMMUNOCHEMICAL METHODS IN CELL AND MOLECULAR BIOLOGY (Academic 
Press, London 1987). The general methodology for making monoclonal antibodies by hybridomas is 
well known. Immortal antibody-producing cell lines can be created by cell fusion, and also by other 
techniques such as direct transformation of B lymphocytes with oncogenic DNA, or transfection with 
Epstein-Barr virus. See, e.g., Schreier et al., HYBRIDOMA TECHNIQUES (1980); U.S. Pat. Nos. 
4,341,761; 4,399,121; 4,427,783; 4,444,887; 4,466,917; 4,472,500; 4,491,632; and 4,493,890. Panels 
of monoclonal antibodies produced against peptides encoded by genes disclosed herein can be 
screened for various properties, e.g., for isotype, epitope affinity, etc. 

[0558] The antibodies of this invention can be purified by standard methods, including but not 
limited to preparative disc-gel electrophoresis, isoelectric focusing, HPLC, reversed-phase HPLC, gel 
filtration, ion exchange and partition chromatography, and countercurrent distribution. Purification 
methods for antibodies are disclosed, e.g., in THE ART OF ANTIBODY PURIFICATION (Amicon 
Division, W. R. Grace & Co. 1989). General protein purification methods are described in PROTEIN 
PURIFICATION: PRINCIPLES AND PRACTICE (R. K. Scopes ed., Springer-Verlag, New York, N.Y. 
1987). 

[0559] Methods for determining the immunogenic capability of the disclosed sequences and the 
characteristics of the resulting sequence-specific antibodies and immune cells are well-known in the 
art. For example, antibodies elicited in response to a peptide comprising a particular polymorphic 
sequence can be tested for their ability to specifically recognize that polymorphic sequence, i.e., to 
bind differentially to a peptide or polypeptide comprising the polymorphic sequence and thus 
distinguish it from a similar peptide or polypeptide containing a different sequence at the same 
position. 
[0560] KITS 

[0561] As set forth herein, the invention provides diagnostic methods, e.g., for determining the 
identity of the allelic variants of polymorphic regions present in the gene loci of genes disclosed 
herein, wherein specific allelic variants of the polymorphic region are associated with cardiovascular 
diseases. In a preferred embodiment, the diagnostic kit can be used to determine whether a subject is 
at risk of developing a cardiovascular disease. This information could then be used, e.g., to optimize 
treatment of such individuals. 

[0562] In preferred embodiments, the kit comprises a probe or primer which is capable of 
hybridizing to a gene and thereby identifying whether the gene contains an allelic variant of a 
polymorphic region which is associated with a risk for cardiovascular disease. The kit preferably 
further comprises instructions for use in diagnosing a subject as having, or having a predisposition 
towards developing, a cardiovascular disease. The probe or primers of the kit can be any of the probes 
or primers described in this file. 

[0563] Preferred kits for amplifying a region of a gene comprising a polymorphic region of interest 
comprise one, two or more primers. 

[0564] Antibody-based diagnostic methods and kits 

[0565] The invention also provides antibody-based methods for detecting polymorphic patterns in 
a biological sample. The methods comprise the steps of: (i) contacting a sample with one or more 
antibody preparations, wherein each of the antibody preparations is specific for a particular 
polymorphic form of the proteins encoded by genes disclosed herein, under conditions in which a 
stable antigen-antibody complex can form between the antibody and antigenic components in the 
sample; and (ii) detecting any antigen-antibody complex formed in step (i) using any suitable means 
known in the art, wherein the detection of a complex indicates the presence of the particular 
polymorphic form in the sample. 

[0566] Typically, immunoassays use either a labelled antibody or a labelled antigenic component 
(e.g., that competes with the antigen in the sample for binding to the antibody). Suitable labels 
include without limitation enzyme-based, fluorescent, chemiluminescent, radioactive, or dye 
molecules. Assays that amplify the signals from the probe are also known, such as, for example, those 
that utilize biotin and avidin, and enzyme-labelled immunoassays, such as ELISA assays. 
[0567] The present invention also provides kits suitable for antibody-based diagnostic applications. 
Diagnostic kits typically include one or more of the following components: 

[0568] Polymorphism-specific antibodies: The antibodies may be pre-labelled; alternatively, the 
antibody may be unlabelled and the ingredients for labelling may be included in the kit in separate 
containers, or a secondary, labelled antibody is provided; and 

[0569] Reaction components: The kit may also contain other suitably packaged reagents and 
materials needed for the particular immunoassay protocol, including solid-phase matrices, if 
applicable, and standards. 

[0570] The kits referred to above may include instructions for conducting the test. Furthermore, in 
preferred embodiments, the diagnostic kits are adaptable to high-throughput and/or automated 
operation. 
[0571] Drug Targets and Screening Methods 

[0572] According to the present invention, nucleotide sequences derived from genes disclosed 
herein and peptide sequences encoded by genes disclosed herein, particularly those that contain one 
or more polymorphic sequences, comprise useful targets to identify cardiovascular drugs, i.e., 
compounds that are effective in treating one or more clinical symptoms of cardiovascular disease. 
Furthermore, especially when a protein is a multimeric protein built of two or more subunits, 
a combination of different polymorphic subunits is very useful. 

[0573] Drug targets include without limitation: (i) isolated nucleic acids derived from the genes 
disclosed herein, and (ii) isolated peptides and polypeptides encoded by genes disclosed herein, each 
of which comprises one or more polymorphic positions. 
[0574] In vitro screening methods 

[0575] In one series of embodiments, an isolated nucleic acid comprising one or more polymorphic 
positions is tested in vitro for its ability to bind test compounds in a sequence-specific manner. The 
methods comprise: (i) providing a first nucleic acid containing a particular sequence at a 
polymorphic position and a second nucleic acid whose sequence is identical to that of the first nucleic 
acid except for a different sequence at the same polymorphic position; (ii) contacting the nucleic 
acids with a multiplicity of test compounds under conditions appropriate for binding; and (iii) 
identifying those compounds that bind selectively to either the first or second nucleic acid sequence. 
[0576] Selective binding as used herein refers to any measurable difference in any parameter of 
binding, such as, e.g., binding affinity, binding capacity, etc. 
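The selection step can be sketched as a filter over measured binding parameters. This example uses dissociation constants as the binding parameter and a fold-difference cut-off; both choices, and all names, are assumptions made for illustration, since the definition above admits any measurable difference in any parameter of binding.

```python
def selective_binders(affinities, min_fold=2.0):
    """Pick compounds that bind one variant preferentially.

    affinities: dict mapping compound name -> (kd_variant1, kd_variant2),
    both dissociation constants in the same units (lower = tighter).
    min_fold: hypothetical cut-off for calling a difference measurable.
    Returns a dict mapping compound name -> preferred variant.
    """
    hits = {}
    for compound, (kd1, kd2) in affinities.items():
        if kd1 <= 0 or kd2 <= 0:
            raise ValueError("dissociation constants must be positive")
        if kd2 / kd1 >= min_fold:
            hits[compound] = "variant 1"   # tighter binding to variant 1
        elif kd1 / kd2 >= min_fold:
            hits[compound] = "variant 2"   # tighter binding to variant 2
    return hits
```

Compounds whose two constants differ by less than the cut-off are treated as non-selective and omitted from the result.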

[0577] In another series of embodiments, an isolated peptide or polypeptide comprising one or 
more polymorphic positions is tested in vitro for its ability to bind test compounds in a sequence- 
specific manner. The screening methods involve: (i) providing a first peptide or polypeptide 
containing a particular sequence at a polymorphic position and a second peptide or polypeptide 
whose sequence is identical to the first peptide or polypeptide except for a different sequence at the 
same polymorphic position; (ii) contacting the polypeptides with a multiplicity of test compounds 
under conditions appropriate for binding; and (iii) identifying those compounds that bind selectively 
to one of the peptide or polypeptide sequences. 

[0578] In preferred embodiments, high-throughput screening protocols are used to survey a large 
number of test compounds for their ability to bind the genes or peptides disclosed above in a 
sequence-specific manner. 

[0579] Test compounds are screened from large libraries of synthetic or natural compounds. 
Numerous means are currently used for random and directed synthesis of saccharide, peptide, and 
nucleic acid based compounds. Synthetic compound libraries are commercially available from 
Maybridge Chemical Co. (Trevillet, Cornwall, UK), Comgenex (Princeton, N.J.), Brandon Associates 
(Merrimack, N.H.), and Microsource (New Milford, Conn.). A rare chemical library is available from 
Aldrich (Milwaukee, Wis.). Alternatively, libraries of natural compounds in the form of bacterial, 
fungal, plant and animal extracts are available from e.g., Pan Laboratories (Bothell, Wash.) or 
MycoSearch (N.C.), or are readily producible. Additionally, natural and synthetically produced 
libraries and compounds are readily modified through conventional chemical, physical, and 
biochemical means. 
[0580] In vivo screening methods 

[0581] Intact cells or whole animals expressing polymorphic variants of genes disclosed herein can 
be used in screening methods to identify candidate cardiovascular drugs. 
[0582] In one series of embodiments, a permanent cell line is established from an individual 
exhibiting a particular polymorphic pattern. Alternatively, cells (including without limitation 
mammalian, insect, yeast, or bacterial cells) are programmed to express a gene comprising one or 
more polymorphic sequences by introduction of appropriate DNA. Identification of candidate 
compounds can be achieved using any suitable assay, including without limitation: (i) assays that 
measure selective binding of test compounds to particular polymorphic variants of proteins encoded 
by genes disclosed herein; (ii) assays that measure the ability of a test compound to modify (i.e., 
inhibit or enhance) a measurable activity or function of proteins encoded by genes disclosed herein; 
and (iii) assays that measure the ability of a compound to modify (i.e., inhibit or enhance) the 
transcriptional activity of sequences derived from the promoter (i.e., regulatory) regions of genes 
disclosed herein. 

[0583] In another series of embodiments, transgenic animals are created in which (i) one or more 
human genes disclosed herein, having different sequences at particular polymorphic positions are 
stably inserted into the genome of the transgenic animal; and/or (ii) the endogenous genes disclosed 
herein are inactivated and replaced with human genes disclosed herein, having different sequences at 
particular polymorphic positions. See, e.g., Coffman, Semin. Nephrol. 17:404 (1997); Esther et al., 
Lab. Invest. 74:953 (1996); Murakami et al., Blood Press. Suppl. 2:36 (1996). Such animals can 
be treated with candidate compounds and monitored for one or more clinical markers of 
cardiovascular status. 

[0584] All patents and publications mentioned herein are hereby incorporated by reference in their 
entireties. 

[0585] The following examples are put forth so as to provide those of ordinary skill in the art with a 
complete disclosure and description of how to make and use the compositions of the invention. The 
examples are intended as non-limiting examples of the invention. Efforts have been made to ensure 
accuracy with respect to numbers (e.g., amounts, temperature, etc.), but some experimental error and 
deviations should, of course, be taken into consideration. Unless indicated otherwise, parts are 
parts by weight, temperature is in degrees centigrade, and pressure is at or near atmospheric. All 
components were obtained commercially unless otherwise indicated. 

Experimental 

[0586] Material and Methods 

[0587] Genotyping of patient DNA with the Pyrosequencing™ Method as described in the patent 
application WO 9813523. 

[0588] A PCR is set up to amplify the flanking regions around a SNP. For this purpose, 2 ng of genomic 
DNA (patient sample) are mixed with a primer set (20 - 40 pmol) producing a 75 to 320 bp PCR 
fragment and with 0.3 to 1 U of Qiagen's Hot Star Taq Polymerase™ in a total volume of 20 µL. One primer 
is biotinylated, depending on the direction of the sequencing primer. To force incorporation of the 
biotinylated primer, it is used at 0.8-fold. 

[0589] For primer design, programs like Oligo 6™ (Molecular Biology Insights) or Primer 
Select™ (DNAStar) are used. PCR setup is performed by a BioRobot 3000 ™ from Qiagen. PCR 
takes place in T1 or Tgradient Thermocyclers ™ from Biometra. 

[0590] The whole PCR reaction is transferred into a PSQ plate ™ (Pyrosequencing) and prepared 
using the Sample Prep Tool ™ and SNP Reagent Kit™ from Pyrosequencing according to their 
instructions. 

[0591] Preparation of template for Pyrosequencing™ 

[0592] Sample preparation using PSQ 96 Sample Prep Tool: 

[0593] Mount the PSQ 96 Sample Prep Tool Cover onto the PSQ 96 Sample Prep Tool as follows: 
Place the cover on the desk, retract the 4 attachment rods by separating the handle from the magnetic 
rod holder, fit the magnetic rods into the holes of the cover plate, push the handle downward until a 
click is heard. The PSQ 96 Sample Prep Tool is now ready for use. 

[0594] To transfer beads from one plate to another, place the covered tool into the PSQ 96 Plate 
containing the samples and lower the magnetic rods by separating the handle from the magnetic rod 
holder. Move the tool up and down a few times then wait for 30-60 seconds. Transfer the beads into a 
new PSQ 96 plate containing the solution of choice. 

[0595] Release the beads by lifting the magnetic rod holder, bringing it together with the handle. 
Move the tool up and down a few times to make sure that the beads are released. 
[0596] All steps are performed at room temperature unless otherwise stated. 
[0597] Immobilization of PCR product: 

[0598] Biotinylated PCR products are immobilized on streptavidin-coated Dynabeads™ M-280 
Streptavidin. Parallel immobilization of several samples is performed in the PSQ 96 Plate. 
[0599] Mix the PCR product (20 µl of a well-optimized PCR) with 25 µl 2X BW-buffer II. Add 60-150 
µg Dynabeads. It is also possible to add a mix of Dynabeads and 2X BW-buffer II to the PCR product, 
yielding a final BW-buffer II concentration of approximately 1x. 
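The "approximately 1x" figure follows from a simple dilution calculation. The sketch below is illustrative only: the function name is hypothetical, and it assumes that the bead suspension adds negligible volume, which the text does not state.

```python
def final_concentration(stock_x, stock_volume_ul, sample_volume_ul):
    """Concentration (in 'x' units) after mixing a stock buffer with a sample.

    Assumes the volumes are additive and that nothing else is added.
    """
    total_volume_ul = stock_volume_ul + sample_volume_ul
    return stock_x * stock_volume_ul / total_volume_ul


# 25 µl of 2X BW-buffer II mixed with 20 µl of PCR product
bw_final = final_concentration(2.0, 25.0, 20.0)
print(f"{bw_final:.2f}x")  # about 1.11x, i.e. approximately 1x
```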

[0600] Incubate at 65°C for 15 min with constant agitation to keep the beads dispersed. For optimal 
immobilization of fragments longer than 300 bp use 30 min incubation time. 
[0601] For strand separation, use the PSQ 96 Sample Prep Tool to transfer the beads with the 
immobilized sample to a PSQ 96 Plate containing 50 µl 0.50 M NaOH per well. Release the beads. 
[0602] After approximately 1 min, transfer the beads with the immobilized strand to a PSQ 96 
Plate containing 99 µl 1x Annealing buffer per well and mix thoroughly. 

[0603] Transfer the beads to a PSQ 96 Plate containing 45 µl of a mix of 1x Annealing buffer and 
3-15 pmoles sequencing primer per well. 

[0604] Heat at 80°C for 2 minutes in the PSQ 96 Sample Prep Thermoplate and move to room 
temperature. 

[0605] After reaching room temperature, continue with the sequencing reaction. 
[0606] Sequencing reaction: 

[0607] Choose the method to be used ("SNP Method") and enter relevant information in the PSQ 
96 Instrument Control software. 

[0608] Place the cartridge and PSQ 96 Plate in the PSQ 96 Instrument. 
[0609] Start the run. 

[0610] Genotyping using the ABI 7700/7900 instrument (TaqMan): 

[0611] SNP genotyping using the TaqMan assay (Applied Biosystems/Perkin Elmer) was performed 
according to the manufacturer's instructions. The TaqMan assay is discussed by Lee et al., Nucleic 
Acids Research 21:3761-3766 (1993). 
[0612] Genotyping with a service contractor: 

[0613] Qiagen Genomics, formerly Rapigene, is a service contractor for genotyping SNPs in 
patient samples. Their method is based on a primer extension method where two complementary 
primers, labeled with different tags, are designed for each genotype. Depending on the 
genotype, only one primer will be elongated together with a certain tag. This tag can be detected by 
mass spectrometry and indicates the respective genotype. The method is described in WO 
9727325, entitled "Detection and identification of nucleic acid molecules - using tags which may be 
detected by non-fluorescent spectrometry or potentiometry". 
Examples 

[0614] To exemplify the present invention and its utility, baySNP 28 is used in the following. 
[0615] baySNP 28 is a C to T polymorphism and presumably resides in the gene of the human 
acidic 82 kDa protein (information taken from table 3). baySNP 28 was genotyped in various patient 
cohorts using the primers from table 2. The following numbers of patients carrying the different 
genotypes were found (information combined from tables 3 and 5a): 



baySNP  Cohort           Total  Genotype 11 "CC"  Genotype 12 "CT"  Genotype 22 "TT"
28      HELD_FEM_HIRESP  12     1                 2                 9
28      HELD_FEM_LORESP  22     3                 12                7



[0616] When comparing the number of female patients exhibiting a high response to statin therapy 
(HELD_FEM_HIRESP) with the control cohort (HELD_FEM_LORESP), it appears that the number 
of low responders carrying the CT genotype is increased. This points to a lower statin response 
among female individuals with the CT genotype. Applying statistical tests to these findings, the 
following p-values were obtained (data taken from table 5b): 



baySNP  Comparison    GTYPE CPVAL  GTYPE XPVAL  GTYPE LRPVAL
28      HELD_FEM_EFF  0.0506       0.0508       0.0442
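The GTYPE CPVAL and GTYPE LRPVAL values can be reproduced from the genotype counts above if CPVAL is read as a Pearson chi-square p-value and LRPVAL as a likelihood-ratio (G-test) p-value, each with 2 degrees of freedom. That reading is an assumption: this section does not name the tests, and XPVAL is not reproduced here. The function names are illustrative.

```python
import math


def expected_counts(table):
    """Expected cell counts under independence of rows and columns."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def pearson_chi2(table):
    """Pearson chi-square statistic for a contingency table."""
    expected = expected_counts(table)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))


def likelihood_ratio_g(table):
    """Likelihood-ratio (G) statistic; cells with zero count contribute 0."""
    expected = expected_counts(table)
    return 2 * sum(o * math.log(o / e)
                   for obs_row, exp_row in zip(table, expected)
                   for o, e in zip(obs_row, exp_row) if o > 0)


def chi2_sf_df2(x):
    """Upper-tail probability of a chi-square variable with exactly 2 degrees of freedom."""
    return math.exp(-x / 2)


# Rows: high responders, low responders; columns: CC, CT, TT (baySNP 28)
genotype_counts = [[1, 2, 9], [3, 12, 7]]
p_chi2 = chi2_sf_df2(pearson_chi2(genotype_counts))
p_lr = chi2_sf_df2(likelihood_ratio_g(genotype_counts))
print(f"{p_chi2:.4f} {p_lr:.4f}")  # 0.0506 0.0442
```

The closed-form tail probability only holds for 2 degrees of freedom, which is what a 2 x 3 genotype table gives; other table shapes would need a general chi-square distribution function.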



[0617] As at least one of the GTYPE p values is below 0.05 the association of genotype and statin 
response phenotype is regarded as statistically significant, i.e., the analysis of a patient's genotype can 
predict the response to statin therapy. In more detail one can calculate the relative risk to exhibit a 
certain statin response phenotype when carrying a certain genotype (data taken from table 6a): 



baySNP  Comparison    GTYPE1  GTYPE2  GTYPE3  RR1   RR2   RR3
28      HELD_FEM_EFF  CC      CT      TT      0.68  0.29  3.38



[0618] In the case of baySNP 28, the risk of exhibiting a high responder phenotype is 3.38 times higher 
when carrying the TT genotype. This indicates that the TT genotype of baySNP 28 is an 
independent risk factor for high statin response in females. On the other hand, carriers of a CT or CC 
genotype have a reduced risk of being a high responder. 
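The genotype relative risks quoted above are consistent with the usual definition: the fraction of high responders among carriers of a genotype, divided by the fraction of high responders among all other patients. The sketch below assumes that definition (the section itself does not give the formula) and uses the baySNP 28 genotype counts; the function name is hypothetical.

```python
def relative_risk(case_counts, control_counts, category):
    """Risk of being a case for carriers of `category` relative to non-carriers.

    Cross-multiplied form of (a / (a + b)) / (c / (c + d)).
    """
    a = case_counts[category]                    # carriers among cases
    b = control_counts[category]                 # carriers among controls
    c = sum(case_counts.values()) - a            # non-carriers among cases
    d = sum(control_counts.values()) - b         # non-carriers among controls
    return (a * (c + d)) / (c * (a + b))


high_responders = {"CC": 1, "CT": 2, "TT": 9}
low_responders = {"CC": 3, "CT": 12, "TT": 7}

genotype_rr = {g: relative_risk(high_responders, low_responders, g)
               for g in high_responders}
for genotype, rr in genotype_rr.items():
    print(genotype, f"{rr:.3f}")  # CC 0.682, CT 0.286, TT 3.375
```

Rounded to two decimals these agree with the tabulated RR1, RR2 and RR3 (0.68, 0.29, 3.38).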
[0619] In addition, statistical associations can be calculated on the basis of alleles. This calculation 
would identify risk alleles instead of risk genotypes. 

[0620] In the case of baySNP 28, the following allele counts were obtained (data combined from tables 
3 and 5a): 



baySNP  Cohort           TOTAL  Allele 1 "C"  Allele 2 "T"
28      HELD_FEM_HIRESP  12     4             20
28      HELD_FEM_LORESP  22     18            26
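The allele counts follow directly from the genotype counts: each patient contributes two alleles, so a homozygote adds two copies of one allele and a heterozygote adds one copy of each. A minimal sketch (the function name is illustrative) that reproduces the table from the baySNP 28 genotype counts:

```python
def allele_counts(genotype_counts):
    """Count alleles from genotype counts, e.g. {"CC": 1, "CT": 2, "TT": 9}."""
    counts = {}
    for genotype, n_patients in genotype_counts.items():
        for allele in genotype:  # two letters per genotype, one per chromosome
            counts[allele] = counts.get(allele, 0) + n_patients
    return counts


high_alleles = allele_counts({"CC": 1, "CT": 2, "TT": 9})
low_alleles = allele_counts({"CC": 3, "CT": 12, "TT": 7})
print(high_alleles)  # {'C': 4, 'T': 20}
print(low_alleles)   # {'C': 18, 'T': 26}
```

Note that the TOTAL column counts patients (12 and 22), so the allele counts in each row sum to twice that number (24 and 44).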



[0621] When comparing the number of female patients with high statin response 
(HELD_FEM_HIRESP) with the control cohort (HELD_FEM_LORESP), it appears that the number 
of high responders carrying the T allele is increased, whereas the number of high responders carrying 
the C allele is diminished. This points to a higher statin response among female individuals with the 
T allele. Applying statistical tests to these findings, the following p-values were obtained (data taken 
from table 5b): 



baySNP  Comparison    Allele CPVAL  Allele XPVAL  Allele LRPVAL
28      HELD_FEM_EFF  0.0411        0.0579        0.0349
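For the 2 x 2 allele table, the same assumed reading (CPVAL as a Pearson chi-square p-value, LRPVAL as a likelihood-ratio p-value, here with 1 degree of freedom) reproduces the tabulated values. The tests are not named in this section, so this is a consistency check under that assumption rather than a statement of the method actually used; XPVAL is not reproduced.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def g_2x2(a, b, c, d):
    """Likelihood-ratio (G) statistic for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    observed = [a, b, c, d]
    expected = [(a + b) * (a + c) / n, (a + b) * (b + d) / n,
                (c + d) * (a + c) / n, (c + d) * (b + d) / n]
    return 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2))


# baySNP 28: high responders carry 4 C and 20 T alleles, low responders 18 C and 26 T
allele_p_chi2 = chi2_sf_df1(chi2_2x2(4, 20, 18, 26))
allele_p_lr = chi2_sf_df1(g_2x2(4, 20, 18, 26))
print(f"{allele_p_chi2:.4f} {allele_p_lr:.4f}")  # approximately 0.0411 and 0.0349
```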



[0622] As at least one of the ALLELE p values is below 0.05, the association of allele and statin 
response phenotype is regarded as statistically significant (in this example, significant p values were 
obtained from two statistical tests). That is, the analysis of a patient's alleles at baySNP 28 can also 
predict the extent of statin response. In more detail, one can calculate the relative risk of exhibiting a 
certain statin response phenotype when carrying a certain allele (data taken from table 6b): 



baySNP  Allele 1  Allele 2  Comparison    RR1   RR2
28      C         T         HELD_FEM_EFF  0.42  2.39



[0623] In the case of baySNP 28, the risk of exhibiting a high responder phenotype is 2.39 times higher 
when carrying the T allele. This indicates that the T allele of baySNP 28 is an independent risk factor 
for a high statin response in females. In other words, those patients should receive lower doses of 
statins in order to avoid ADR. However, due to their 'high responder' phenotype they will still benefit 
from the drug. In turn, carriers of the C allele should receive higher drug doses in order to experience 
a beneficial therapeutic effect. 
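The allele-level relative risks can be checked the same way as the genotype-level ones, assuming the usual definition of relative risk (not spelled out in this section). With only two alleles, the two values are reciprocals of each other, which is why RR1 = 0.42 and RR2 = 2.39 carry the same information. The function name is illustrative.

```python
def allele_relative_risks(case_alleles, control_alleles):
    """Relative risk of being a case for each allele versus all other alleles."""
    case_total = sum(case_alleles.values())
    control_total = sum(control_alleles.values())
    risks = {}
    for allele in case_alleles:
        a = case_alleles[allele]         # this allele among cases
        b = control_alleles[allele]      # this allele among controls
        c = case_total - a               # other alleles among cases
        d = control_total - b            # other alleles among controls
        risks[allele] = (a * (c + d)) / (c * (a + b))
    return risks


allele_rr = allele_relative_risks({"C": 4, "T": 20}, {"C": 18, "T": 26})
print({k: f"{v:.3f}" for k, v in allele_rr.items()})  # {'C': '0.418', 'T': '2.391'}
```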
[0624] Another example is baySNP 29, which is taken to exemplify polymorphisms relevant for 
adverse drug reactions. baySNP 29 was found significant when comparing male patients with severe 
ADR to the respective controls (as defined in table 1b). 

[0625] The relative risk ratios for the genotypes AA, AG and GG were as follows (data taken from 
table 6a): 



baySNP  Comparison        GTYPE1  GTYPE2  GTYPE3  RR1   RR2   RR3
29      HELD_MAL_ADR5ULN  AA      AG      GG      3.15  0.66  0.32



[0626] In this case, male patients carrying the AA genotype have a 3.15 times higher risk of suffering 
from ADR. In other words, those patients should either receive lower doses of statins or switch to an 
alternative therapy in order to avoid ADR. On the other hand, male patients with AG or GG genotypes 
appear to be more resistant to ADR and hence tolerate statin therapy better. 
[0627] As can be seen from the following tables, some of the associations that are disclosed in the 
present invention are indicative of more than one phenotype. baySNP 1837, for example, is linked to 
ADR, but also to the risk of suffering from CVD (table 6). 



Table 1a 

DEFINITION OF "GOOD" AND "BAD" SERUM LIPID LEVELS 

                         "Good"     "Bad"
LDL-Cholesterol [mg/dL]  125 - 150  170 - 200
Cholesterol [mg/dL]      190 - 240  265 - 315
HDL-Cholesterol [mg/dL]  60 - 105   30 - 55
Triglycerides [mg/dL]    45 - 115   170 - 450



[0628] According to the PROCAM algorithm (Assmann, G. et al., Am. J. Cardiol. 77:1179-1184 
(1996)) it is possible to define other cohorts. For example, a lipid-based equation would calculate y as 
follows: 

y = -0.0146*LDL + 0.0418*HDL - 0.3362*ln(TRIGLY) 

[0629] Good or bad cohorts could then be defined in the following way (FEM = female, MAL = 
male): 
FEM_GOOD  y > -1.4
FEM_BAD   y < -1.4
MAL_GOOD  y > -1.7
MAL_BAD   y < -1.7
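The lipid-based score and the cohort cut-offs above can be sketched as a small function. This is an illustration only: it reads the logarithm in the equation as the natural logarithm, assumes all inputs are in mg/dL as in Table 1a, and uses hypothetical function names and hypothetical input values. The text does not say how a score exactly equal to the cut-off is handled, so that case returns None.

```python
import math

# Cut-offs taken from the cohort definitions above
CUTOFFS = {"FEM": -1.4, "MAL": -1.7}


def lipid_score(ldl, hdl, trigly):
    """y = -0.0146*LDL + 0.0418*HDL - 0.3362*ln(TRIGLY)."""
    return -0.0146 * ldl + 0.0418 * hdl - 0.3362 * math.log(trigly)


def lipid_cohort(sex, ldl, hdl, trigly):
    """Return e.g. 'FEM_GOOD' or 'MAL_BAD'; None if y equals the cut-off."""
    y = lipid_score(ldl, hdl, trigly)
    cutoff = CUTOFFS[sex]
    if y > cutoff:
        return f"{sex}_GOOD"
    if y < cutoff:
        return f"{sex}_BAD"
    return None


# Hypothetical lipid values chosen for illustration, not patient data
print(lipid_cohort("FEM", ldl=130, hdl=70, trigly=100))  # FEM_GOOD (y is about -0.52)
print(lipid_cohort("MAL", ldl=190, hdl=35, trigly=300))  # MAL_BAD (y is about -3.23)
```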



Table 1b 

DEFINITION OF DRUG RESPONSE PHENOTYPES 

Low responder 
    Decrease of serum LDL of at least 10% and at most 50% upon 
    administration of 0.8 mg Cerivastatin (female patients) 

High responder 
    Decrease of serum LDL of at least 50% upon administration of 0.4 mg 
    Cerivastatin (female patients) 

Very low responder 
    Decrease of serum LDL of at least 10% and at most 15% upon 
    administration of 0.8 mg Cerivastatin (female patients) 

Very high responder 
    Decrease of serum LDL of at least 55% upon administration of 0.4 mg 
    Cerivastatin (female patients) 

Ultra low responder 
    Decrease of serum LDL of at least 10% and at most 25% upon 
    administration of 0.8 mg Cerivastatin (female patients) 

Ultra high responder 
    Decrease of serum LDL of at least 60% upon administration of 0.4 mg 
    Cerivastatin (female patients) 

Tolerant patient 
    No diagnosis of muscle cramps, muscle pain, muscle weakness, myalgia 
    or myopathy AND serum CK levels below 70 mg/dl in women and 
    below 80 mg/dl in men. 

ADR patient (CK increase at least 2xULN) 
    Diagnosis of muscle cramps, muscle pain, muscle weakness, myalgia or 
    myopathy OR serum CK levels higher than 140 mg/dl in women and 
    160 mg/dl in men. 

Advanced ADR patient [ADR3] (advanced CK increase, at least 3xULN)* 
    Serum CK levels higher than 210 mg/dl in women and 240 mg/dl in 
    men 

Severe ADR patient [ADR5] (severe CK increase, at least 5xULN)* 
    Serum CK levels higher than 350 mg/dl in women and 400 mg/dl in 
    men 


* When assembling the cohorts for advanced and severe ADR, focus was on the CK serum levels as 
those provide a more independent measure of statin related ADR. 
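The tolerance and ADR definitions of Table 1b can be expressed as a threshold lookup. The sketch below is an illustration under stated assumptions: the function and dictionary names are hypothetical, the most severe matching class is returned (the ADR classes are nested), and patients who meet neither the "tolerant" nor the "ADR" definition are reported as unclassified because the table does not assign them to a cohort. CK values are taken in the units given in the table.

```python
# Serum CK cut-offs from Table 1b, per sex
CK_LIMITS = {
    "female": {"tolerant_below": 70, "adr": 140, "adr3": 210, "adr5": 350},
    "male": {"tolerant_below": 80, "adr": 160, "adr3": 240, "adr5": 400},
}


def classify_adr(sex, ck, muscle_symptoms):
    """Return 'ADR5', 'ADR3', 'ADR', 'tolerant' or 'unclassified'.

    muscle_symptoms: True if muscle cramps, muscle pain, muscle weakness,
    myalgia or myopathy was diagnosed.
    """
    limits = CK_LIMITS[sex]
    if ck > limits["adr5"]:
        return "ADR5"
    if ck > limits["adr3"]:
        return "ADR3"
    if ck > limits["adr"] or muscle_symptoms:
        return "ADR"
    if ck < limits["tolerant_below"]:
        return "tolerant"
    return "unclassified"


print(classify_adr("male", 410, False))    # ADR5
print(classify_adr("female", 60, True))    # ADR
print(classify_adr("male", 100, False))    # unclassified
```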
Table 1c 

DEFINITION OF "HIGH" AND "LOW" SERUM HDL CHOLESTEROL LEVELS 

                                Male individuals  Female individuals
"High" HDL-Cholesterol [mg/dL]  >80               >104
"Low" HDL-Cholesterol [mg/dL]   <35               <37



[0630] Informed consent was signed by the patients and control subjects. Blood was taken by a 
physician according to standard medical procedures. 

[0631] Samples were collected anonymously and labeled with a patient number. 
[0632] DNA was extracted using kits from Qiagen. 

Table 2a 

OLIGONUCLEOTIDE PRIMERS USED FOR GENOTYPING USING MASS SPECTROMETRY 
[0633] The baySNP number refers to an internal numbering of the PA SNPs. Primer sequences are 
listed for preamplification of the genomic fragments (primers EF and ER) and for subsequent allele 
specific PCR of the SNP. 
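In Table 2a the allele-specific primers appear to follow a pattern: a lowercase 5' tag followed by an uppercase target-specific part whose 3' terminal base matches the allele (for names ending in F) or its complement (for names ending in R). That interpretation of the primer names (first letter = allele, F = forward, R = reverse) is an assumption drawn from the table, not something the text states. The sketch below, with hypothetical function names, checks the pattern on four primers from the table.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def split_primer(sequence):
    """Split a primer into its lowercase 5' tag and uppercase target-specific part."""
    i = 0
    while i < len(sequence) and sequence[i].islower():
        i += 1
    return sequence[:i], sequence[i:]


def expected_3prime_base(primer_name):
    """Base expected at the 3' end, e.g. 'CF' -> 'C', 'CR' -> 'G' (assumed naming)."""
    allele, strand = primer_name[0], primer_name[1]
    return allele if strand == "F" else COMPLEMENT[allele]


def matches_allele(primer_name, sequence):
    """True if the 3' terminal base agrees with the allele implied by the name."""
    _, target = split_primer(sequence)
    return target[-1] == expected_3prime_base(primer_name)


# Allele-specific primers for baySNP 28 (C137T) and baySNP 52 (C397G) from Table 2a
primers = {
    ("28", "CF"): "gggacggtcggtagatTCTAGAATTGTGCTTCCC",
    ("28", "TF"): "gctggctcggtcaagaTCTAGAATTGTGCTTCCT",
    ("52", "CR"): "gggacggtcggtagatTATTTTATAATGCAAAAG",
    ("52", "GR"): "gctggctcggtcaagaTATTTTATAATGCAAAAC",
}
for (snp, name), seq in primers.items():
    tag, target = split_primer(seq)
    print(snp, name, tag, target, matches_allele(name, seq))
```

The two primers of each pair share the same target-specific part up to the final base and differ in their tag, which is consistent with the tag-based detection described for the mass spectrometry method.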



baySNP  SNP     Name  Sequence
28      C137T   CF    gggacggtcggtagatTCTAGAATTGTGCTTCCC
28      C137T   EF    TGTCCAGTGTTAGGAAAAA
28      C137T   ER    GACGATGCCTTCAGCACAGATGTGGCTTCTGTATGAG
28      C137T   TF    gctggctcggtcaagaTCTAGAATTGTGCTTCCT
29      A464G   AF    gggacggtcggtagatCATCGGTCAGTGTCCCCA
29      A464G   EF    GATGTCTGTCTCCTTGATGT
29      A464G   ER    GACGATGCCTTCAGCACAATGTGGGGGTTTTATTTT
29      A464G   GF    gctggctcggtcaagaCATCGGTCAGTGTCCCCG
52      C397G   CR    gggacggtcggtagatTATTTTATAATGCAAAAG
52      C397G   EF    GACGATGCCTTCAGCACAGTGAATTGCCAGATTAGTG
52      C397G   ER    TCTAAAGTGCTGGGATTG
52      C397G   GR    gctggctcggtcaagaTATTTTATAATGCAAAAC
56      A429G   AF    gggacggtcggtagatAAGGTCTTTGTACGTGTA
56      A429G   EF    CCAGGTACTGCCTTACAAA
56      A429G   ER    GACGATGCCTTCAGCACAGCTCCCAAAATAAATCACTC
56      A429G   GF    gctggctcggtcaagaAAGGTCTTTGTACGTGTG
89      A159G   AR    gggacggtcggtagatTGGAGTCGGGGGAGTCAT
89      A159G   EF    GACGATGCCTTCAGCACATAGTTCAAGGGTAAAGGA
89      A159G   ER    GAGGACGAGATGTAAGAG
89      A159G   GR    gctggctcggtcaagaTGGAGTCGGGGGAGTCAC
90      C154T   CF    gggacggtcggtagatCAGCGCATCCTGAACCAC
90      C154T   EF    GCTGGAACGAGTTCATCCT
90      C154T   ER    GACGATGCCTTCAGCACAGGACCCCACCTTTCTTGT
90      C154T   TF    gctggctcggtcaagaCAGCGCATCCTGAACCAT
99      C58T    CR    gggacggtcggtagatTCCTGCTCTTTTCTCTAG
99      C58T    EF    GACGATGCCTTCAGCACACACTGACTGCTTACTCTACC
99      C58T    ER    TACTGTGTCTCAGCTCCA
99      C58T    TR    gctggctcggtcaagaTCCTGCTCTTTTCTCTAA
140     C468T   CR    gggacggtcggtagatGTGAATCCCAATACGAAG
140     C468T   EF    GACGATGCCTTCAGCACATAAAAAATAACCAGGTACTCCA
140     C468T   ER    GATGAGTCCTTCACCAAACATACA
140     C468T   TR    gctggctcggtcaagaGTGAATCCCAATACGAAA
152     A587G   AF    gggacggtcggtagatGGTGGGAGGTTCCAGCCA
152     A587G   EF    GCAGGAAGAAAGCTAGAA
152     A587G   ER    GACGATGCCTTCAGCACAAGGCAGGATAATGACAAC
152     A587G   GF    gctggctcggtcaagaGGTGGGAGGTTCCAGCCG
214     A209G   AF    gggacggtcggtagatCATTTCCACCTCACCAAA
214     A209G   EF    AGGTATTCCCGGCGTTTC
214     A209G   ER    GACGATGCCTTCAGCACATGTTGTGCGTCTGCTTCC
214     A209G   GF    gctggctcggtcaagaCATTTCCACCTCACCAAG
221     C339G   CF    gggacggtcggtagatTGTGAAGAACTGTTGCTC
221     C339G   EF    CTGAAGCTCATCTGCCTTCT
221     C339G   ER    GACGATGCCTTCAGCACATCCCCTTCCTTCTTACCT
221     C339G   GF    gctggctcggtcaagaTGTGAAGAACTGTTGCTG
224     C189T   CR    gggacggtcggtagatGCCCGCTTTTCTTCATCG
224     C189T   EF    GACGATGCCTTCAGCACACTGTCTTCAAGGGCTTACAC
224     C189T   ER    TCCAACTTCAGGCAAAAC
224     C189T   TR    gctggctcggtcaagaGCCCGCTTTTCTTCATCA
294     C465T   CR    gggacggtcggtagatCCCAAGGCCAACAGGGAG
294     C465T   EF    GACGATGCCTTCAGCACAGCATTCTTATGCCAGTGTTC
294     C465T   ER    ATCCATCCCATCCTGTGT
294     C465T   TR    gctggctcggtcaagaCCCAAGGCCAACAGGGAA
307     C215T   CR    gggacggtcggtagatGAGTGGGTGCTGTTCCCG
307     C215T   EF    GACGATGCCTTCAGCACAGTTACTGCCTCTCTGACC
307     C215T   ER    AGTGTGACCTGCTCTCTT
307     C215T   TR    gctggctcggtcaagaGAGTGGGTGCTGTTCCCA
411     A369T   ER    gacgatgccttcagcacaAACACATTCCCCCTCTAC
411     A369T   EF    GTCTCTATTCCAAGCCAAG
411     A369T   AF    gggacggtcggtagatCCCCGCTCCAGCTCCTCA
411     A369T   TF    gctggctcggtcaagaCCCCGCTCCAGCTCCTCT
449     C323G   CR    gggacggtcggtagatCCGCTTCTGCTTCTGCTG
449     C323G   EF    GACGATGCCTTCAGCACAAGGAGAAGAGGGAGGAGA
449     C323G   ER    GGAGCACGTAAGGAGAAA
449     C323G   GR    gctggctcggtcaagaCCGCTTCTGCTTCTGCTC
466     C123T   CF    gggacggtcggtagatGGCCAGGGGCTGGAGGGC
466     C123T   EF    TCTTCAGTTCTCTCAGCTTC
466     C123T   ER    GACGATGCCTTCAGCACATCACTAGGGGCTCTTACC
466     C123T   TF    gctggctcggtcaagaGGCCAGGGGCTGGAGGGT
472     A497G   AR    gggacggtcggtagatTCCTCCCGCTGCTTCAGT
472     A497G   EF    GACGATGCCTTCAGCACATCACTTACCCATCATACTTCTTTTTC
472     A497G   ER    AATCCTGCCTCCCACCTT
472     A497G   GR    gctggctcggtcaagaTCCTCCCGCTGCTTCAGC
542     A402G   AR    gggacggtcggtagatAGAAATTCCCTCCCAACT
542     A402G   EF    GACGATGCCTTCAGCACATGATTGAGCCAGTTGTTT
542     A402G   ER    GGGGTGTATTTTGAGAGTG
542     A402G   GR    gctggctcggtcaagaAGAAATTCCCTCCCAACC
739     C87G    CR    gggacggtcggtagatGCTGGTTTGACTGGACGG
739     C87G    EF    GACGATGCCTTCAGCACAACCTTGGTATAATCCTTTCC
739     C87G    ER    AGGCAACCTAATCCACTT
739     C87G    GR    gctggctcggtcaagaGCTGGTTTGACTGGACGC
821     A140C   AF    gggacggtcggtagatAGTGCTGTGATACCTGGA
821     A140C   CF    gctggctcggtcaagaAGTGCTGTGATACCTGGC
821     A140C   EF    ACACCCACAAAACAAGAA
821     A140C   ER    GACGATGCCTTCAGCACAGGAACAAGGACATAAAAGAG
1005    A257G   AR    gggacggtcggtagatAGGAAATGTTAGCCCTGT
1005    A257G   EF    GACGATGCCTTCAGCACACTCCACTTCTCTATGCCTC
1005    A257G   ER    GTCCCCAGCTATGTATTGT
1005    A257G   GR    gctggctcggtcaagaAGGAAATGTTAGCCCTGC
1055    A287T   AF    gggacggtcggtagatCTCAGGGAGGGAGAGAGA
1055    A287T   EF    GGGACAGACAGACAGACA
1055    A287T   ER    GACGATGCCTTCAGCACACAACTCCTTCTTCAGCAC
1055    A287T   TF    gctggctcggtcaagaCTCAGGGAGGGAGAGAGT
1056    A354G   AR    gggacggtcggtagatGCGGCTGCCCCGTCCTGT
1056    A354G   EF    GACGATGCCTTCAGCACAGTGTGTCTATGTGTCTGTGTG
1056    A354G   ER    CGGACTTCTCCTTCTTGT
1056    A354G   GR    gctggctcggtcaagaGCGGCTGCCCCGTCCTGC
1085    A251G   EF    TAGGGTAAGCAGCAAGAG
1085    A251G   ER    CACAAGGCAAGAGATAACA
1085    A251G   AF    gggacggtcggtagatCAGGCAAGATAGACAGCA
1085    A251G   GF    gctggctcggtcaagaCAGGCAAGATAGACAGCG
1086    A104G   EF    GTGCCCATACGAACAGAATAG
1086    A104G   ER    TGCCAAGTACCCCAAGAG
1086    A104G   AR    gggacggtcggtagatCCATTCCTCCCCAGACAT
1086    A104G   GR    gctggctcggtcaagaCCATTCCTCCCCAGACAC
1092    C1687G  CF    gggacggtcggtagatCGTGCGAGCAGCGAAAGC
1092    C1687G  EF    CCAGAGAGAAGTCGAGGAAGAGA
1092    C1687G  ER    GACGATGCCTTCAGCACAGTCACCCCCAAAAGCAGG
1092    C1687G  GF    gctggctcggtcaagaCGTGCGAGCAGCGAAAGG
1096    G454T   EF    GACGATGCCTTCAGCACACTTTTCCTCCTAGCCCAC
1096    G454T   ER    AAGTGATGTAACCCTCCTCTC
1096    G454T   GR    gggacggtcggtagatTCAGCTATAAATAGGGCC
1096    G454T   TR    gctggctcggtcaagaTCAGCTATAAATAGGGCA
1101    C249T   CR    gggacggtcggtagatTGATGGCGGGTGCCAAGG
1101    C249T   EF    GACGATGCCTTCAGCACAGCTCTTTCCTTTGCTTCC
1101    C249T   ER    CACTGGGGGTCCTCTTAC
1101    C249T   TR    gctggctcggtcaagaTGATGGCGGGTGCCAAGA
1204    A307G   AR    gggacggtcggtagatCAAGGGCACTCACATTAT
1204    A307G   EF    GACGATGCCTTCAGCACAGCTCTTGCGTCTGTTTCC
1204    A307G   ER    TTTCCCTTCTGTCCCCTT
1204    A307G   GR    gctggctcggtcaagaCAAGGGCACTCACATTAC
1504    C180T   CF    gggacggtcggtagatGTGACTTTTGGTTCCCAC
1504    C180T   EF    AACTCGGGGTCACTGGTCT
1504    C180T   ER    GACGATGCCTTCAGCACACAGCGGGTATGGAGGATG
1504    C180T   TF    gctggctcggtcaagaGTGACTTTTGGTTCCCAT
1511    G153T   EF    ACACCAGTTCTCCCTCCT
1511    G153T   ER    GACGATGCCTTCAGCACACCCACCTTTCCTAATCCT
1511    G153T   GF    gggacggtcggtagatTTGGGACTCTGCGTCAAG
1511    G153T   TF    gctggctcggtcaagaTTGGGACTCTGCGTCAAT
1524    A284C   AF    gggacggtcggtagatCTCTCAAAGCCCACACAA
1524    A284C   CF    gctggctcggtcaagaCTCTCAAAGCCCACACAC
1524    A284C   EF    AGAAAAAGAAAAGGAAAAAGA
1524    A284C   ER    GACGATGCCTTCAGCACAGGAAAGTTACAAGGCTATGA
1556    C367G   CR    gggacggtcggtagatACCTGCCTCTAAGGTCTG
1556    C367G   EF    GACGATGCCTTCAGCACAAGGAGAAGACAGTTCAAGG
1556    C367G   ER    ACAGTTGCCAGAGAAAAG
1556    C367G   GR    gctggctcggtcaagaACCTGCCTCTAAGGTCTC
1561    A251C   EF    TCACTTGCCTCTACTCCA
1561    A251C   ER    ATACCAGAAAGACTAAGCTCC
1561    A251C   AF    gggacggtcggtagatGGGTGAGCTCTGTGGGCA
1561    A251C   CF    gctggctcggtcaagaGGGTGAGCTCTGTGGGCC
1582    C389T   CR    gggacggtcggtagatCCAAGGGTTATGGCAGGG
1582    C389T   EF    GACGATGCCTTCAGCACACCTGACTATTTGGGGTTGTG
1582    C389T   ER    ATCGCTCTCTGCTTCTGCT
1582    C389T   TR    gctggctcggtcaagaCCAAGGGTTATGGCAGGA
1638    A443G   AR    gggacggtcggtagatCCAAAACCCCAGCGCTGT
1638    A443G   EF    GACGATGCCTTCAGCACACTCTTTATCCTGCTTATGGT
1638    A443G   ER    CCAAGCTCACTCTGTAGG
1638    A443G   GR    gctggctcggtcaagaCCAAAACCCCAGCGCTGC
1662    C251T   EF    AATACAATGGAAGCCAAG
1662    C251T   ER    CCTAATCGAACAGAAAGG
1662    C251T   CF    gggacggtcggtagatCCAGTCTCCATCCACTTC
1662    C251T   TF    gctggctcggtcaagaCCAGTCTCCATCCACTTT
1714    A376G   AF    gggacggtcggtagatTGAACGGCATGACGGGGA
1714    A376G   EF    AAGTGTTTCTGCTGTGCCT
1714    A376G   ER    GACGATGCCTTCAGCACACAAGTCCTGGTTTTCCATC
1714    A376G   GF    gctggctcggtcaagaTGAACGGCATGACGGGGG
1722    C89T    CF    gggacggtcggtagatACCCCAGGATGCCCACAC
1722    C89T    EF    GTTTATCCTCCTCATGTCC
1722    C89T    ER    GACGATGCCTTCAGCACAGTTACCTTTTCCACCTCTC
1722    C89T    TF    gctggctcggtcaagaACCCCAGGATGCCCACAT
1757    A210G   AF    gggacggtcggtagatGGAAACAAACCAAAATGA
1757    A210G   EF    CCAGCACCCAAAATAAGA
1757    A210G   ER    GACGATGCCTTCAGCACAATAAGTTGAAGCCCTCCC
1757    A210G   GF    gctggctcggtcaagaGGAAACAAACCAAAATGG
1765    A240G   AF    gggacggtcggtagatGGCTTCACGGAGGAAGAA
1765    A240G   EF    TTAGGAGCTGTGAGGTATG
1765    A240G   ER    GACGATGCCTTCAGCACATAAGATGGAGCAGGGTAG
1765    A240G   GF    gctggctcggtcaagaGGCTTCACGGAGGAAGAG
1776    A200G   AF    gggacggtcggtagatAAAGGGCTCCCAACACCA
1776    A200G   EF    TGAGCACAAGATCAGAGAGG
1776    A200G   ER    GACGATGCCTTCAGCACAAGACAGAGACGCAGGAATG
1776    A200G   GF    gctggctcggtcaagaAAAGGGCTCCCAACACCG
1799    C370T   CF    gggacggtcggtagatAGGGACAACCAAAGTGAC
1799    C370T   EF    ATCATCAGAACAGCCCTAC
1799    C370T   ER    GACGATGCCTTCAGCACACAAGCCCACCTACTTACTC
1799    C370T   TF    gctggctcggtcaagaAGGGACAACCAAAGTGAT
1806    A201G   AF    gggacggtcggtagatTGGGCGTCCTGGTGGGCA
1806    A201G   EF    TCTTCGGGCTAACTCTTT
1806    A201G   ER    GACGATGCCTTCAGCACACTGTCACTCCAAACCTTCT
1806    A201G   GF    gctggctcggtcaagaTGGGCGTCCTGGTGGGCG
1837    C413T   CF    gggacggtcggtagatCTCAGCTTCATGCAGGGC
1837    C413T   EF    CCCACTCAGCCCTGCTCTT
1837    C413T   ER    GACGATGCCTTCAGCACAGCATCCTTGGCGGTCTTG
1837    C413T   TF    gctggctcggtcaagaCTCAGCTTCATGCAGGGT
1870    C323T   CF    gggacggtcggtagatCTCCTCATTGCCTCCTTC
1870    C323T   EF    CACCTCTTTTCTCCTTCTCTT
1870    C323T   ER    GACGATGCCTTCAGCACACCCACCCCCTCTATCTAC
1870    C323T   TF    gctggctcggtcaagaCTCCTCATTGCCTCCTTT
1882    C115T   CR    gggacggtcggtagatGTCCCCCACAAGTCCTCG
1882    C115T   EF    GACGATGCCTTCAGCACAGACCTGTACCCTTTACCC
1882    C115T   ER    TGTTTCCCTGTCTGTTTC
1882    C115T   TR    gctggctcggtcaagaGTCCCCCACAAGTCCTCA
1988    C214T   CF    gggacggtcggtagatGTGACTCGGTCCTATACC
1988    C214T   EF    GTGGGCTGTGATTGTGTT
1988    C214T   ER    GACGATGCCTTCAGCACATCTCGTCGTCGTAGTAGTTGT
1988    C214T   TF    gctggctcggtcaagaGTGACTCGGTCCTATACT
2000    C349T   CR    gggacggtcggtagatAGTATGGTAATTAGGAAG
2000    C349T   EF    GACGATGCCTTCAGCACACTGACACTGAGCCACAAC
2000    C349T   ER    AACTGATGAGCAAGAAGGA
2000    C349T   TR    gctggctcggtcaagaAGTATGGTAATTAGGAAA
2071    A338G   AR    gggacggtccgtagatAAAATTGTTTCCTGTGAT
2071    A338G   EF    GACGATGCCTTCAGCACACATTGCTATTCTCAGGCTATA
2071    A338G   ER    CCCATTCTCTGCTTGACAGT
2071    A338G   GR    gctggctcggtcaagaAAAATTGTTTCCTGTGAC
2078    G876T   EF    CCAGAGAGGGGATAAAGA
2078    G876T   ER    GACGATGCCTTCAGCACAGAGTGTCAAGAGGAACAGG
2078    G876T   GF    gggacggtcggtagatTGGCTGCTGAGGTCTGAG
2078    G876T   TF    gctggctcggtcaagaTGGCTGCTGAGGTCTGAT
2085    G415T   EF    GCTTTTTCTTTTCATTACATC
2085    G415T   ER    GACGATGCCTTCAGCACACCTCTTTTAGAATCAGAGACA
2085    G415T   GF    gggacggtcggtagatGGTAGTGTTACCAGAAAG
2085    G415T   TF    gctggctcggtcaagaGGTAGTGTTACCAGAAAT
2095    A406G   AR    gggacggtcggtagatTGTGCACCGGGATATTTT
2095    A406G   EF    GACGATGCCTTCAGCACAATGTGTGCTTGGGTTCTT
2095    A406G   ER    GGTGTTTCTCCTCCTCTCT
2095    A406G   GR    gctggctcggtcaagaTGTGCACCGGGATATTTC
2119    A67G    AR    gggacggtcggtagatGTGGGCACCAAACGCTAT
2119    A67G    EF    GACGATGCCTTCAGCACAGATGTAGGGCTGGAAGTG
2119    A67G    ER    TCAAGAAAAATGGGAGTTG
2119    A67G    GR    gctggctcggtcaagaGTGGGCACCAAACGCTAC
2141    A176G   EF    TGTAGCATCGGTAGGTTC
2141    A176G   ER    CAACATCAGACTTTCTTTTTC
2141    A176G   AR    gggacggtcggtagatTGGTACAGGGCTAGTTTT
2141    A176G   GR    gctggctcggtcaagaTGGTACAGGGCTAGTTTC
2182    A318G   AF    gggacggtcggtagatAGGCGGGCCAAGGGTGAA
2182    A318G   EF    TTCTCTCTCCCCTTCTGT
2182    A318G   ER    GACGATGCCTTCAGCACATAAATGTTCACTCTTCTTGCT
2182    A318G   GF    gctggctcggtcaagaAGGCGGGCCAAGGGTGAG
2234    G296T   EF    GGGTTGTTCCAGGGCGCTATT
2234    G296T   ER    GACGATGCCTTCAGCACATGTGGAGAGGCCGGGTGC
2234    G296T   GF    gggacggtcggtagatGAACCAGCCCCCTGGAAG
2234    G296T   TF    gctggctcggtcaagaGAACCAGCCCCCTGGAAT
2281    A227C   AR    gggacggtcggtagatCAGGCTTGGAGACCTGGT
2281    A227C   CR    gctggctcggtcaagaCAGGCTTGGAGACCTGGG
2281    A227C   EF    GACGATGCCTTCAGCACAGGGTATTCAGTTGGAAGG
2281    A227C   ER    AAGGCAAGGTTCTTAGTTG
2298    A77C    AR    gggacggtcggtagatTCTAAAAGCACTTGAAAT
2298    A77C    CR    gctggctcggtcaagaTCTAAAAGCACTTGAAAG
2298    A77C    EF    GACGATGCCTTCAGCACACCTGCTAGTGTTTTCTGG
2298    A77C    ER    TGTAACTGATAGGTGGTGG
2341    C286T   CR    gggacggtccgtagatTGAAGATTCTGCTCAGCG
2341    C286T   EF    GACGATGCCTTCAGCACAAGGGCCCGGGACTCAT
2341    C286T   ER    TTTGGGGTCCTGCGGATG
2341    C286T   TR    gctggctcggtcaagaTGAAGATTCTGCTCAGCA
2357    A165G   AF    gggacggtcggtagatCAAAGAAGACGAAAATGA
2357    A165G   EF    CTCAAGTTTGTTACTGATTTCTC
2357    A165G   ER    GACGATGCCTTCAGCACAGGGTTACGTCTGCTCTTC
2357    A165G   GF    gctggctcggtcaagaCAAAGAAGACGAAAATGG
2366    G50T    EF    GACGATGCCTTCAGCACACTGCTCCGAAACACGGTC
2366    G50T    ER    GCATCTTCAGCCCTTCTTACTCT
2366    G50T    GR    gggacggtcggtagatCTCCTGGGCACCACGGGC
2366    G50T    TR    gctggctcggtcaagaCTCCTGGGCACCACGGGA
2995    A299C   ER    gacgatgccttcagcacaTGGGATTAGACACGAGAG
2995    A299C   EF    AAAGAACTGGAAGAAGGAA
2995    A299C   AF    gggacggtcggtagatGTCACCTCCTTTCCACTA
2995    A299C   CF    gctggctcggtcaagaGTCACCTCCTTTCCACTC
3360    G777T   ER    gacgatgccttcagcacaAGAAAAATGAGAGGGAAAAC
3360    G777T   EF    GATGAAGGGAAATGGAAC
3360    G777T   GF    gggacggtcggtagatCCAACTATATAGGAGCCG
3360    G777T   TF    gctggctcggtcaagaCCAACTATATAGGAGCCT
3464    A110G   EF    CTGAACCGAGGAGATTTTT
3464    A110G   ER    TGATGCTTACAGAACTGGG
3464    A110G   AF    gggacggtcggtagatGTGTAGTGGGCAGGGTTA
3464    A110G   GF    gctggctcggtcaagaGTGTAGTGGGCAGGGTTG
3975    A65C    EF    gacgatgccttcagcacaAAAAGAACCCTGGTGAAG
3975    A65C    ER    CCCTGATAAAAGAGATGGA
3975    A65C    AR    gggacggtcggtagatCGCATGGGAGTCAGGGAT
3975    A65C    CR    gctggctcggtcaagaCGCATGGGAGTCAGGGAG
3976    A239G   EF    gacgatgccttcagcacaATGAGGGAGCAAGACAAG
3976    A239G   ER    TGATAAAAGAGATGGAAGGAG
3976    A239G   AR    gggacggtcggtagatGTCACTGTTTGTCACTGT
3976    A239G   GR    gctggctcggtcaagaGTCACTGTTTGTCACTGC
4206    A304T   EF    gacgatgccttcagcacaCTTTTTAGCCAAGTGGAG
4206    A304T   ER    GGATCTGAGGAATCTGTG
4206    A304T   AR    gggacggtcggtagatACCAGGCAGAGAGAAAAT
4206    A304T   TR    gctggctcggtcaagaACCAGGCAGAGAGAAAAA
4912    A74G    EF    CTTCACTGAGCGTCCGCAGAG
4912    A74G    ER    CCGTCGGCCCGATTCA
4912    A74G    AR    CAGGCGAGCCTCAGCCCT
4912    A74G    GR    CAGGCGAGCCTCAGCCCC
4925


A251C 


EF 


TCATTTCCCAATTTACCTCC 


4925 


A251C 


ER 


CCTCTTTCCCATCTCCCT 


4925 


A251C 


AF 


gggacggtcggtagatAGCCAGGAGCCTGCGTCA 


4925 


A251C 


CF 


gctggctcggtcaagaAGCCAGGAGCCTGCGTCC 


4966 


A251G 


EF 


CATTGCTCTTCCTCTCTGT 


4966 


A251G 


ER 


GTGTCATCATTCCTTTCTTG 


4966 


A251G 


AR 


gggacggtcggtagatTCAGAGACATGAGTCCAT 


4966 


A251G 


GR 


gctggctcggtcaagaTCAGAGACATGAGTCCAC 


5014 


A2057G 


ER 


gacgatgccttcagcacaCACCTGTCCCACCCTATTT 


5014 


A2057G 


EF 


GTCCTGAACCCCCATTCT 


5014 


A2057G 


AF 


gggacggtcggtagatGCCTGCACTGCGTTCCTA 


5014 


A2057G 


GF 


gctggctcggtcaagaGCCTGCACTGCGTTCCTG 


5296 


A251G 


EF 


GCTCCTCTGCCTTCTGCTT 


5296 


A251G 


ER 


ATTTGCCCACTGCCCTTC 


5296 


A251G 


AF 


gggacggtcggtagatTGGCTGCAGGTGCGTCCA 


5296 


A251G 


GF 


gctggctcggtcaagaTGGCTGCAGGTGCGTCCG 


5298 


C172T 


EF 


GCCACACACACCTTAACA 


5298 


C172T 


ER 


AAAGTTCTCTGCCTCCAA 


5298 


C172T 


CF 


gggacggtcggtagatAGCTCTCAGCTGGGGTGC 


5298 


C172T 


TF 


gctggctcggtcaagaAGCTCTCAGCTGGGGTGT 


5457 


A134G 


EF 


AGCAGAATGGGCAATAGA 


5457 


A134G 


ER 


AGAGATGTGGGCAGAGAA 


5457 


A134G 


AF 


gggacggtcggtagatGGAAAGCCTACTTTCTTA 


5457 


A134G 


GF 


gctggctcggtcaagaGGAAAGCCTACTTTCTTG 


5704 


C61T 


EF 


ACAGCCATAACAGGAGTG 


5704 


C61T 


ER 


GGGTTACTCAACCTAAGAGA 


5704 


C61T 


CR 


gggacggtcggtagatGTTCTCTTTGGGAAAACG 


5704 


C61T 


TR 


gctggctcggtcaagaGTTCTCTTTGGGAAAACA 


5717 


A1960G 


EF 


gacgatgccttcagcacaGAACAGAAACCACAGAACC 


5717 


A1960G 


ER 


GTCCCACCCTATTTTGAG 


5717 


A1960G 


AR 


gggacggtcggtagatCACTGGCCCACCTCCCTT 


5717 


A1960G 


GR 


gctggctcggtcaagaCACTGGCCCACCTCCCTC 


5959 


A71G 


EF 


gacgatgccttcagcacaACCATGCCTGACTTAACC 
5959


A71G 


ER 


TTGTTTCCTGTCCTCTTTC 


5959 


A71G 


AR 


gggacggtcggtagatGTTAAGAGGCTGGGCAGT 


5959 


A71G 


GR 


gctggctcggtcaagaGTTAAGAGGCTGGGCAGC 


6162 


C340G 


EF 


gacgatgccttcagcacaAGTGTTGTTAGGAGCAAAG 


6162 


C340G 


ER 


CTTAGGAAACTGAGGTGG 


6162 


C340G 


CR 


gggacggtcggtagatCTGCAGCCTGGGCAACAG 


6162 


C340G 


GR 


gctggctcggtcaagaCTGCAGCCTGGGCAACAC 


6236 


C906T 


ER 


gacgatgccttcagcacaTGGACACATTTGAGCTTT 


6236 


C906T 


EF 


CTTCCCCAGAGATGACTAC 


6236 


C906T 


CF 


gggacggtcggtagatCCCCATCCTACTCAGCAC 


6236 


C906T 


TF 


gctggctcggtcaagaCCCCATCCTACTCAGCAT 


6744 


C348T 


ER 


gacgatgccttcagcacaGGTTACAGTGAGCCAAGA 


6744 


C348T 


EF 


AGGTGAAGAAAGCAAAATAC 


6744 


C348T 


CF 


gggacggtcggtagatTGGTGTGTGTTTTGTTTC 


6744 


C348T 


TF 


gctggctcggtcaagaTGGTGTGTGTTTTGTTTT 


7133 


C63G 


EF 


TTGAGACCCTACAGAGCCA 


7133 


C63G 


ER 


GGCAAGCTGAGGTGAAAG 


7133 


C63G 


CR 


gggacggtcggtagatAATAAGGTAAGAAATGAG 


7133 


C63G 


GR 


gctggctcggtcaagaAATAAGGTAAGAAATGAC 


8210 


A251G 


EF 


TAATTTCTAATGGCCTTCC 


8210 


A251G 


ER 


TCACTTACTCCCTGATGTCT 


8210 


A251G 


AR 


gggacggtcggtagatCATTGGGTTTTCCCTCAT 


8210 


A251G 


GR 


gctggctcggtcaagaCATTGGGTTTTCCCTCAC 


8592 


C46T 


ER 


gacgatgccttcagcacaACATTTAGTGCCAACATCAC 


8592 


C46T 


EF 


CTCTTCCCTGAGACACCA 


8592 


C46T 


CF 


gggacggtcggtagatGAAGGTGAAGGCCAGAGC 


8592 


C46T 


TF 


gctggctcggtcaagaGAAGGTGAAGGCCAGAGT 


8943 


A251C 


EF 


GAGGCTGAGACAGAAGAA 


8943 


A251C 


ER 


GTTTGACATTAAAGAAAATGAG 


8943 


A251C 


AR 


gggacggtcggtagatGGCTGGAGTGCAGTGATT 


8943 


A251C 


CR 


gctggctcggtcaagaGGCTGGAGTGCAGTGATG 


9193 


C88G 


EF 


CACGCTGTTGAGTGGG 


9193 


C88G 


ER 


CGCAGGTCTACGGTCA 
9193


C88G 


CR 


gggacggtcggtagatCCCGGGTCTGAGGCTGCG 


9193 


C88G 


GR 


gctggctcggtcaagaCCCGGGTCTGAGGCTGCC 


9516 


A187G 


EF 


CACACACACACACACACAC 


9516 


A187G 


ER 


GGTCCCTTACTTTCCTCTT 


9516 


A187G 


AR 


gggacggtcggtagatCCTATCCCTACTTCCCCT 


9516 


A187G 


GR 


gctggctcggtcaagaCCTATCCCTACTTCCCCC 


9698 


A251G 


EF 


GTGACCCCAAAAGAGAGA 


9698 


A251G 


ER 


CTAGCTTGTTACTGCCTCC 


9698 


A251G 


AF 


gggacggtcggtagatGGCACGACCCCGCCCCCA 


9698 


A251G 


GF 


gctggctcggtcaagaGGCACGACCCCGCCCCCG 


9883 


A249G 


EF 


TCCACAACCTCAAAACCAC 


9883 


A249G 


ER 


CACAGTCCTGCAAGCTCA 


9883 


A249G 


AR 


gggacggtcggtagatCCGTGGCCGTGGCTCACT 


9883 


A249G 


GR 


gctggctcggtcaagaCCGTGGCCGTGGCTCACC 


10481 


A107T 


ER 


gacgatgccttcagcacaGTTCGGGGCTCCACTT 


10481 


A107T 


EF 


TAGCGGGACAGCGCTG 


10481 


A107T 


AF 


gggacggtcggtagatCCCGGCGCGCCTCGGAGA 


10481 


A107T 


TF 


gctggctcggtcaagaCCCGGCGCGCCTCGGAGT 


10542 


C367T 


EF 


gacgatgccttcagcacaAATACACTGGGTCCTGCT 


10542 


C367T 


ER 


ATACTGCTGGCCTTTCTC 


10542 


C367T 


CR 


gggacggtcggtagatGGTCAGGGGAGCCCAGAG 


10542 


C367T 


TR 


gctggctcggtcaagaGGTCAGGGGAGCCCAGAA 


10600 


A251G 


EF 


CCTGGCAACTAACCTCTT 


10600 


A251G 


ER 


AGGCAGTCTCTCTGTCTACTC 


10600 


A251G 


AR 


gggacggtcggtagatATTGGCCCTGCTCAGGAT 


10600 


A251G 


GR 


gctggctcggtcaagaATTGGCCCTGCTCAGGAC 


10621 


C402T 


EF 


CCAGCCCTAAACCTAAA 


10621 


C402T 


ER 


AACCTCTCAAGATCAGACAC 


10621 


C402T 


CF 


gggacggtcggtagatTTAGCACTTAATAAGTAC 


10621 


C402T 


TF 


gctggctcggtcaagaTTAGCACTTAATAAGTAT 


10745 


A251G 


EF 


CCCCACAACAAAGAAAGA 


10745 


A251G 


ER 


GAAGCCAACTCTCCAACA 


10745 


A251G 


AF 


gggacggtcggtagatCAAGGATTTCAAAAACCA 
10745


A251G 


GF 


gctggctcggtcaagaCAAGGATTTCAAAAACCG 


10771 


C64G 


EF 


gacgatgccttcagcacaCCAGGGAAGAGCAGAACC 


10771 


C64G 


ER 


TGTACGGGAAGAGGCAGA 


10771 


C64G 


CR 


gggacggtcggtagatAGGGTGACACAGGCCACG 


10771 


C64G 


GR 


gctggctcggtcaagaAGGGTGACACAGGCCACC 


10870 


A251G 


EF 


ATCCCATCCCAACACACA 


10870 


A251G 


ER 


CCGAGACCAAACTCATTCAC 


10870 


A251G 


AR 


gggacggtcggtagatGGCAGAGCCTGAGTCACT 


10870 


A251G 


GR 


gctggctcggtcaagaGGCAGAGCCTGAGTCACC 


10877 


A251C 


EF 


CCTGTTTCTCAACCTTCTC 


10877 


A251C 


ER 


ATGGTCTATGGAACCTAATCT 


10877 


A251C 


AF 


gggacggtcggtagatGCACTGATTCTGCTTCCA 


10877 


A251C 


CF 


gctggctcggtcaagaGCACTGATTCTGCTTCCC 


10948 


G140T 


EF 


AAGGACAGGGTCAGGAAAG 


10948 


G140T 


ER 


CAGAGGGAGGAAGGAGGT 


10948 


G140T 


GF 


gggacggtcggtagatATGGAGGAGGGTGTCTGG 


10948 


G140T 


TF 


gctggctcggtcaagaATGGAGGAGGGTGTCTGT 


11001 


C286T 


EF 


gacgatgccttcagcacaTTCCCAAAGACCCACA 


11001 


C286T 


ER 


CCTCCACCGCTATCAC 


11001 


C286T 


CR 


gggacggtcggtagatTGGCTGCAGGACGTCCAG 


11001 


C286T 


TR 


gctggctcggtcaagaTGGCTGCAGGACGTCCAA 


11001 


C286T 


EF 


TTCCCAAAGACCCACA 


11001 


C286T 


ER 


CCTCCACCGCTATCAC 


11001 


C286T 


CR 


gggacggtcggtagatTGGCTGCAGGACGTCCAG 


11001 


C286T 


TR 


gctggctcggtcaagaTGGCTGCAGGACGTCCAA 


11073 


C215G 


EF 


CCCAACCACCCGTTCC 


11073 


C215G 


ER 


GCGCGGGAGCTAGAGA 


11073 


C215G 


CF 


gggacggtcggtagatGAAGCTGCGGGCCGGACC 


11073 


C215G 


GF 


gctggctcggtcaagaGAAGCTGCGGGCCGGACG 


11153 


C116T 


EF 


CGAGTGGGAAGAAAAGTAGA 


11153 


C116T 


ER 


ATGACTGCCTGCCTAGAA 


11153 


C116T 


CR 


gggacggtcggtagatAAGATAGGGTAGAGGCCG 


11153 


C116T 


TR 


gctggctcggtcaagaAAGATAGGGTAGAGGCCA 
11210


C194T 


EF 


GAGGAGTGAGGGAAAGTAAG 


11210 


C194T 


ER 


AAATGGAGAGAGATGGGA 


11210 


C194T 


CF 


gggacggtcggtagatCCAGGAAATGACATGATC 


11210 


C194T 


TF 


gctggctcggtcaagaCCAGGAAATGACATGATT 


11248 


C225T 


EF 


TGAGTTGAACAGCACTTGG 


11248 


C225T 


ER 


AGGGTAAGGGAGGGAAAA 


11248 


C225T 


CR 


gggacggtcggtagatTGATTCTTTCGCTTGGCG 


11248 


C225T 


TR 


gctggctcggtcaagaTGATTCTTTCGCTTGGCA 


11372 


A251G 


EF 


TAGAAAAGAAGAAAAATCAA 


11372 


A251G 


ER 


ACACACACACACACACAC 


11372 


A251G 


AR 


gggacggtcggtagatCATCACCTTTTAGTTTCT 


11372 


A251G 


GR 


gctggctcggtcaagaCATCACCTTTTAGTTTCC 


11449 


C251G 


EF 


ACAGAAGAACAACAACAAAAC 


11449 


C251G 


ER 


TGCGTATGAGGTAAAGAGA 


11449 


C251G 


CF 


gggacggtcggtagatATGAGTGAAGCCTGTCTC 


11449 


C251G 


GF 


gctggctcggtcaagaATGAGTGAAGCCTGTCTG 


11450 


A251T 


EF 


ACAGAAGAACAACAACAAAAC 


11450 


A251T 


ER 


TGCGTATGAGGTAAAGAGA 


11450 


A251T 


AR 


gggacggtcggtagatGGACCATAATCTTGAAGT 


11450 


A251T 


TR 


gctggctcggtcaagaGGACCATAATCTTGAAGA 


11470 


C251T 


EF 


GCTTGTCTTGTCTGATAGGTG 


11470 


C251T 


ER 


CAACGTGAGAATTTCCAAAAT 


11470 


C251T 


CR 


gggacggtcggtagatTGAGAATTTCCAAAATAG 


11470 


C251T 


TR 


gctggctcggtcaagaTGAGAATTTCCAAAATAA 


11472 


A251T 


EF 


TACATTCAAGGCAAGAAAA 


11472 


A251T 


ER 


TGATTAGTTACAATTACCTCTAGTATC 


11472 


A251T 


AF 


gggacggtcggtagatAGTTTGTCAGTAAATGTA 


11472 


A251T 


TF 


gctggctcggtcaagaAGTTTGTCAGTAAATGTT 


11487 


A485T 


EF 


gacgatgccttcagcacaAGAGAGCAGCTAGACTGAGA 


11487 


A485T 


ER 


TTCCTGCAAACAGTTGAG 


11487 


A485T 


AR 


gggacggtcggtagatAGTTGAGGGCTCAGGATT 


11487 


A485T 


TR 


gctggctcggtcaagaAGTTGAGGGCTCAGGATA 


11488 


C533G 


EF 


gacgatgccttcagcacaAGAGAGCAGCTAGACTGAGA 
11488


C533G 


ER 


GTAAATAAAATGGGATGGTG 


11488 


C533G 


CR 


gggacggtcggtagatGCCCCAGCAAGCTGCATG 


11488 


C533G 


GR 


gctggctcggtcaagaGCCCCAGCAAGCTGCATC 


11493 


A171G 


EF 


CCTTTTGTGTTTTGTTTTGT 


11493


A171G 


ER 


CTTCTCCACCTTCCATTC 


11493


A171G 


AF 


gggacggtcggtagatGGGAACTCCTAAATCAAA 


11493 


A171G 


GF 


gctggctcggtcaagaGGGAACTCCTAAATCAAG 


11502 


C455T 


EF 


gacgatgccttcagcacaACGATGGGGTCAGAGTCA 


11502 


C455T 


ER 


CCTACATTTCACACACGAACA 


11502 


C455T 


CR 


gggacggtcggtagatACACACTCCTCTCTCAAG 


11502 


C455T 


TR 


gctggctcggtcaagaACACACTCCTCTCTCAAA 


11534 


G258T 


EF 


GCCATCGTCTTTCCCT 


11534 


G258T 


ER 


TCCTCCCTCCTTCTCTCT 


11534 


G258T 


GR 


gggacggtcggtagatCCTCCACCCACCAGGGCC 


11534 


G258T 


TR 


gctggctcggtcaagaCCTCCACCCACCAGGGCA 


11537 


A251G 


EF 


CCTCTTTCTCCTCCTCTTC 


11537 


A251G 


ER 


CTCTTCCTGTCTTCTCCTCT


11537 


A251G 


AF 


gggacggtcggtagatAGATGGACCTCTACAGGA 


11537 


A251G 


GF 


gctggctcggtcaagaAGATGGACCTCTACAGGG 


11560 


A185G 


EF 


CTCCTCCAACTCCTTTAC 


11560 


A185G 


ER 


ATACTTCTCACTGCATCCT 


11560 


A185G 


AR 


gggacggtcggtagatCCTGTCCCCTCCCTAGTT 


11560 


A185G 


GR 


gctggctcggtcaagaCCTGTCCCCTCCCTAGTC 


11594 


C251T 


EF 


CACCTTCCTGAACTCACTC 


11594 


C251T 


ER 


TGATGTCTGTGCTGTCCT 


11594 


C251T 


CR 


gggacggtcggtagatTCTGGTCCACTCAAGGAG 


11594 


C251T 


TR 


gctggctcggtcaagaTCTGGTCCACTCAAGGAA 


11624 


C251T 


EF 


TCGGGAGGTGTAAGTAAG 


11624 


C251T 


ER 


CCACAGTCAGAAGAGACAA 


11624 


C251T 


CR 


gggacggtcggtagatAGAGACCCTGGTCCCAAG


11624 


C251T 


TR 


gctggctcggtcaagaAGAGACCCTGGTCCCAAA 


11627 


C251T 


EF 


TTTATCACTACACCCCCTACTC 


11627 


C251T 


ER 


GACAGACCGACCAATCAC 
11627


C251T 


CR 


gggacggtcggtagatCCCTGGGAAGGTTGAGAG 


11627 


C251T 


TR 


gctggctcggtcaagaCCCTGGGAAGGTTGAGAA 


11650 


A146G 


EF 


CTGTCTGTTTGGGTCTTC 


11650 


A146G 


ER 


CGTTGTTCTCTGTCCACT 


11650 


A146G 


AR 


gggacggtcggtagatGGCCAAATGTCTAAAAGT 


11650 


A146G 


GR 


gctggctcggtcaagaGGCCAAATGTCTAAAAGC 


11654 


A251G 


EF 


CGTATCTCTTGCCTTTCTT 


11654 


A251G 


ER 


CTTCTCTTATGCCTTCCC 


11654 


A251G 


AF 


gggacggtcggtagatTTACTTGAAAGGACACCA 


11654 


A251G 


GF 


gctggctcggtcaagaTTACTTGAAAGGACACCG 


11655 


A251C 


EF 


CGTATCTCTTGCCTTTCTT 


11655 


A251C 


ER 


CTTCTCTTATGCCTTCCC 


11655 


A251C 


AF 


gggacggtcggtagatTTCTGCACTAAAGCTGTA 


11655 


A251C 


CF 


gctggctcggtcaagaTTCTGCACTAAAGCTGTC 


11656 


C251T 


EF 


TGGGAAGAAAAAGAGAAG 


11656 


C251T 


ER 


GTTGAAACACTGCACAAG 


11656 


C251T 


CR 


gggacggtcggtagatCAGGGCTGTTGGGTGAAG 


11656 


C251T 


TR 


gctggctcggtcaagaCAGGGCTGTTGGGTGAAA 


11825 


A277G 


ER 


gacgatgccttcagcacaTGAATAGACAGGGACGAA 


11825 


A277G 


EF 


GACCTTGGAAATAATGGAG 


11825 


A277G 


AF 


gggacggtcggtagatCAACCCAGCAAAAATGGA 


11825 


A277G 


GF 


gctggctcggtcaagaCAACCCAGCAAAAATGGG 


11914 


A246T 


EF 


gacgatgccttcagcacaTTGGAAGTGAGATAAGATAGGT 


11914 


A246T 


ER 


ACGGTGAGAATGAGAGGT 


11914 


A246T 


AR 


gggacggtcggtagatAAAACAGACATCAGAGGT 


11914 


A246T 


TR 


gctggctcggtcaagaAAAACAGACATCAGAGGA 


12097 


A411G 


ER 


gacgatgccttcagcacaGATGAAACCCTGTCTCTACT 


12097 


A411G 


EF 


TTATCAACCTTAGTCTCCCT 


12097 


A411G 


AF 


gggacggtcggtagatACCTGCCACCACACCCAA 


12097 


A411G 


GF 


gctggctcggtcaagaACCTGCCACCACACCCAG 


12366 


A412G 


ER 


gacgatgccttcagcacaGCTGATGTGGTTGTGAG 


12366 


A412G 


EF 


GTTCCTGTAGCTCGTGTAG 


12366 


A412G 


AF 


gggacggtcggtagatCTCCCCGCCCTGCAGCAA 
12366


A412G 


GF 


gctggctcggtcaagaCTCCCCGCCCTGCAGCAG 


12619 


A25G 


ER 


gacgatgccttcagcacaTGGCTGGACTTTGACTGATA 


12619 


A25G 


EF 


TCTTGTTTGTGTCACAGTGC 


12619 


A25G 


AF 


gggacggtcggtagatTGTGTCACAGTGCTCTGA 


12619 


A25G 


GF 


gctggctcggtcaagaTGTGTCACAGTGCTCTGG 


13025 


A585C 


EF 


gacgatgccttcagcacaTTTAAGTAACATGACAAACTC 


13025 


A585C 


ER 


ATCTGATAACTGAGCAGG 


13025 


A585C 


AR 


gggacggtcggtagatCTATTAAGTAACTGGTGT 


13025 


A585C 


CR 


gctggctcggtcaagaCTATTAAGTAACTGGTGG 


13191 


A504G 


ER 


gacgatgccttcagcacaATTCTCCCATTTCTCCTGT 


13191 


A504G 


EF 


TGCCTCTTCTCCTCATTC 


13191 


A504G 


AF 


gggacggtcggtagatCCCTAATGTCTTCCTCTGA 


13191 


A504G 


GF 


gctggctcggtcaagaCCCTAATGTCTTCCTCTGG 


900045 


C116T 


EF 


ATCTCCTGATCCAAGTCC 


900045 


C116T 


ER 


CACACTGTGCCCATCTAC 


900045 


C116T 


CR 


gggacggtcggtagatCTGACTGATTACCTCATG 


900045 


C116T 


TR 


gctggctcggtcaagaCTGACTGATTACCTCATA 


900078 


A251G 


EF 


CATAGGTAAAGATCTGTAGGTG 


900078 


A251G 


ER 


CCACCTTGGAAGTTGGCAAA 


900078 


A251G 


AR 


gggacggtcggtagatattaaatcgcctctctcT 


900078 


A251G 


GR 


gctggctcggtcaagaattaaatcgcctctctcC


900107 


C426T 


ER 


gacgatgccttcagcacaAGGGCTTTTTCAGGTAGA 


900107 


C426T 


EF 


GACCTTTCCTGGGTAGAA 


900107 


C426T 


CF 


gggacggtcggtagatACTCTGAACCTGGGGGAC 


900107 


C426T 


TF 


gctggctcggtcaagaACTCTGAACCTGGGGGAT 


10000002 


A103G 


AF 


gggacggtcggtagatGATCAACACAATCTTCAA 


10000002 


A103G 


EF 


CAGCTGAAAGAGATGAAATTTACT 


10000002 


A103G 


ER 


GACGATGCCTTCAGCACAAACTTATGAAGATTAAGGCATAGG 


10000002 


A103G 


GF 


gctggctcggtcaagaGATCAACACAATCTTCAG 


10000006 


G107A 


AF 


gctggctcggtcaagaGGGCTGGGCTGCTAGGGA 


10000006 


G107A 


EF 


AGACGAGTTCAAGGTGAGTG 


10000006 


G107A 


ER 


GACGATGCCTTCAGCACACCAAGTTTCCGAGTTTCC 


10000006 


G107A 


GF 


gggacggtcggtagatGGGCTGGGCTGCTAGGGG 



AttyDktNo. 1300-0015 
PATENT 



baySNP 


SNP 


Name 


Sequence 


10000014 


A153C 


AF 


gggacggtcggtagatGTACCAATACATCCTGCA 


10000014 


A153C 


CF 


gctggctcggtcaagaGTACCAATACATCCTGCC 


10000014 


A153C 


EF 


CTGCTGATGTCTCTGTTG 


10000014 


A153C 


ER 


GACGATGCCTTCAGCACAGACTTACTTTGCTCACACTT 


10000025 


C291T 


CF 


gggacggtcggtagatCCTCACTTCCTCAACGCC 


10000025 


C291T 


EF 


CCTCTCTGTCTGGTTATCTTG 


10000025 


C291T 


ER 


GACGATGCCTTCAGCACAAGTGTGCCTCCTGGTTAG 


10000025 


C291T 


TF 


gctggctcggtcaagaCCTCACTTCCTCAACGCT 
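
The allele-specific primer rows listed above follow a regular pattern that can be checked mechanically. The following Python sketch is illustrative only and is not part of the disclosed method. It assumes that each row consists of four fields (baySNP number, SNP label, primer name, sequence), that the first letter of a two-letter primer name is the allele and the second letter is the strand (F or R), and that a reverse primer ends in the complement of the allele. These conventions are inferred from the listing itself; the names `PrimerRow`, `parse_rows` and `three_prime_matches_allele` are hypothetical.

```python
from typing import Iterator, NamedTuple, Optional

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


class PrimerRow(NamedTuple):
    bay_snp: str   # internal baySNP number, e.g. "2095"
    snp: str       # SNP label, e.g. "A406G"
    name: str      # primer name, e.g. "GR" or "EF"
    sequence: str  # primer sequence as printed


def parse_rows(text: str) -> Iterator[PrimerRow]:
    """Group the non-empty lines of the flattened listing into four-field rows."""
    fields = [line.strip() for line in text.splitlines() if line.strip()]
    for i in range(0, len(fields) - 3, 4):
        yield PrimerRow(*fields[i:i + 4])


def three_prime_matches_allele(row: PrimerRow) -> Optional[bool]:
    """Return True/False for allele-specific primers, None for other rows.

    Assumption (inferred from the listing, not stated in it): the first letter
    of the name is the allele, the second is the strand (F or R), and a
    reverse primer carries the complement of the allele at its 3' end.
    """
    if len(row.name) != 2:
        return None
    allele, strand = row.name[0], row.name[1]
    if allele not in COMPLEMENT or strand not in ("F", "R"):
        return None
    expected = allele if strand == "F" else COMPLEMENT[allele]
    return row.sequence[-1].upper() == expected


# Three rows copied from the listing above.
SAMPLE = """
2095
A406G
GR
gctggctcggtcaagaTGTGCACCGGGATATTTC

2182
A318G
AF
gggacggtcggtagatAGGCGGGCCAAGGGTGAA

2182
A318G
EF
TTCTCTCTCCCCTTCTGT
"""

for row in parse_rows(SAMPLE):
    print(row.bay_snp, row.name, three_prime_matches_allele(row))
```

Rows whose name does not start with a nucleotide letter (for example EF and ER) are reported as `None`, because the sketch makes no assumption about their 3' base.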



Table 2b 

OLIGONUCLEOTIDE PRIMERS USED FOR GENOTYPING USING PYROSEQUENCING 
[0634] The baySNP number refers to an internal numbering of the PA SNPs. Primer sequences are 
listed for preamplification of the genomic fragments and for sequencing of the SNP using the 
pyrosequencing method. Bio: Biotinylated Oligonucleotide. 



baySNP    Name       Sequence
2995      Primer F   GCCAAGACTAGGAAGTAAGTGT
2995      Primer R   Bio-CCCAGAACCACAAAGCTAGTAA
2995      Seq.       TGCCCTGGTCACCTCCTTTCC
3689      Primer F   Bio-CTGACCCTGACCTTCATACTCAA
3689      Primer R   AGAAGAAAGAAGCCTCTCTACAGTT
3689      Seq.       AACAGATCAGGTTGGTG
4838      Primer F   Bio-CAAAGATGACCTTATGGCTCTGA
4838      Primer R   GTCTCGGAACATGACCTTTAGT
4838      Seq.       TGACTAAGAATGTAATGGGGAAGA
6498      Primer F   CTTTGTGGATCTTTCTGCGGTGT
6498      Primer R   Bio-CCATGTTGAGGAGCCCAGAGTGA
6498      Seq.       ATTACAGTTGTGAGATTGTGC
8021      Primer F   GGCCTTCTATGTACTAGGCG
8021      Primer R   Bio-CTCTTTCTGGAGGCATCAATC
8021      Seq.       CACAGGGAGACCCC
8060      Primer F   Bio-GCCTTATTTTCCACTCCCACCT
8060      Primer R   TACCTTTCCCCATCCCAACTG
8060      Seq.       TCAGCATATGTTTGGATT
8846      Primer F   ATTTGAGAGAAGGTAGGGT
8846      Primer R   Bio-TTTGTTACTCTGTAGCCA
8846      Seq.       A AATATTCAGTAACTTGTTT
9849      Primer F   AAG CAG CAA TCG AAT CCC TT
9849      Primer R   TGT TGT TGT TTG GCT AGC TCC
9849      Seq.       CCT GCC TTA CTG AGA GCC AAA
10079     Primer F   Bio-CACGCCAATTCCCACCATCCT
10079     Primer R   GTCCGTCGAGGGGGAATGTGTTT
10079     Seq.       AATGTGTTTCTTGGGGGT
10747     Primer F   CTAACCATCTTCCAAATGCTTAATC
10747     Primer R   Bio-TCCTTGAGTCTGAGTTTCCC
10747     Seq.       CACAAGAAACCCTGAAA
11578     Primer F   CTC GGC GTG CTT GGT AAT AA
11578     Primer R   CGG AGC CGA ACT CTG GAG GAA TCT
11578     Seq.       GGC TGG CAA GTT GTT CCA TCC CAC
11644     Primer F   TGA GCA GCG CAT CCT
11644     Primer R   TGC AGC CCA CTG ACT CAA
11644     Seq.       GCT GTT ACT CAG TAT GAT
12008     Primer F   CCGAAGACCAAGACGC
12008     Primer R   Bio-TCTTCCATAAAAACAAGGCTC
12008     Seq.       AAACAAGAAATTCTGTTTA
13937     Primer F   TGA CAG CTC CCA TTG GAA
13937     Primer R   AAT TAA TGC GAT CCC TC
13937     Seq.       GAC AGC TCC CAT TGG AAG
900002    Primer F   ATTGGGCAGGGATAAGAGAAAAG
900002    Primer R   Bio-GATGAATCACAGAATGCGGTAT
900002    Seq.       CACACAGCAGTTCACGCA
900013    Primer F   GCCAAGACTAGGAAGTAAGTGT
900013    Primer R   Bio-CCCAGAACCACAAAGCTAGTAA
900013    Seq.       TGCCCTGGTCACCTCCTTTCC
900025    Primer F   Bio-AGTGGCTCACTTGCTAACG
900025    Primer R   CTGGGGAAGAAAATAAATGAA
900025    Seq.       CTTGCTCTTAGGATACACGT
900032    Primer F   AGCGTCTTCACCATCTGCT
900032    Primer R   Bio-GGGAAGGAGGAAGCCAAACA
900032    Seq.       ACATGTCTGATGATACCTGG
900045    Primer F   Bio-GCCATGCACGATTTCCC
900045    Primer R   CACTGTGCCCATCTACGAG
900045    Seq.       GGACCTGACTGATTACCT
900065    Primer F   GAGTAGCTAGGATCACAGGTGCGT
900065    Primer R   Bio-TGTTCGAGATTTAAGAAAGTTGGC
900065    Seq.       CAGGTGCGTGCCACCATGCCC
900082    Primer F   CAC ACA ATT TTC CAC TTA
900082    Primer R   GAC TCC AGT TTT CTA TCA
900082    Seq.       ATG TTG ATG TAA TCT ACT
900096    Primer F   TGGGGCAAGCAACAGTGGT
900096    Primer R   Bio-TAGGCAGGGCAAGGGATTAGG
900096    Seq.       TTTAAATTCTCTGACAGAGAC
900107    Primer F   Bio-GCCACCAGCCCACACTCTGAACCTG
900107    Primer R   CCATCAGCCTTCACCCACGTGCCA
900107    Seq.       GCCTCAGCTTGACCT
900115    Primer F   Bio-GGTAAGTGCGTGCCTGGGAGATGC
900115    Primer R   CGGGGTGGGGAGGACAGAGC
900115    Seq.       GAGGACAGAGCAAAAGGAT
900121    Primer F   Bio-TGCCTTACAATATACAATGG
900121    Primer R   CAATGGGTAAGGAGTAAAGTT
900121    Seq.       TTCCAGCTGCTTTTA
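
The pyrosequencing entries above mix two notations: some sequences are written in triplets separated by spaces, and biotinylated oligonucleotides carry a biotin prefix. A minimal Python sketch of how such entries could be normalized before further processing is given here for illustration only. The function name `normalize_primer` is hypothetical, and the sketch assumes that the biotin label always appears as a leading prefix of the form "Bio-" in any letter case.

```python
from typing import Tuple


def normalize_primer(entry: str) -> Tuple[str, bool]:
    """Return (bare upper-case sequence, is_biotinylated) for one sequence entry."""
    text = entry.strip()
    # Assumption: the biotin label is a leading "Bio-" prefix, in any letter case.
    biotinylated = text.lower().startswith("bio-")
    if biotinylated:
        text = text[len("bio-"):]
    # Remove the spaces used to group some sequences into triplets.
    sequence = "".join(text.split()).upper()
    return sequence, biotinylated


# Entries copied from the pyrosequencing primer listing.
for entry in ("Bio-CCCAGAACCACAAAGCTAGTAA", "AAG CAG CAA TCG AAT CCC TT"):
    print(normalize_primer(entry))
```

Keeping the biotin flag separate from the sequence lets the bare sequences be compared or searched without the label getting in the way.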



Table 2c 

OLIGONUCLEOTIDE PRIMERS USED FOR GENOTYPING USING RESTRICTION FRAGMENT LENGTH POLYMORPHISM (RFLP)
[0635] The baySNP number refers to an internal numbering of the PA SNPs. Primer sequences are
listed for preamplification of the genomic fragments. The restriction enzyme used for RFLP is
indicated.
baySNP    Name       Sequence                    Enzyme
900173    Primer F   GAACAAACCTCCGAGATGCTAC      Hind III
900173    Primer R   GTCTTATGTTACTGGGCTTTCACC    Hind III
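
In an RFLP assay the two alleles are distinguished by whether the amplified fragment is cut by the restriction enzyme. The following Python sketch illustrates the principle for Hind III, whose recognition sequence is AAGCTT and which cuts after the first A. It is a simplified illustration, not the disclosed protocol: the two amplicon sequences are hypothetical and are not taken from the baySNP 900173 fragment, and the function name `fragment_lengths` is an assumption introduced here.

```python
from typing import List

HINDIII_SITE = "AAGCTT"   # Hind III recognition sequence (palindromic)
HINDIII_CUT_OFFSET = 1    # the enzyme cuts after the first A: A^AGCTT


def fragment_lengths(amplicon: str,
                     site: str = HINDIII_SITE,
                     cut_offset: int = HINDIII_CUT_OFFSET) -> List[int]:
    """Return the fragment lengths expected after complete digestion."""
    seq = amplicon.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Hypothetical 20-base amplicons that differ at a single position.
allele_with_site = "ACGTACGTAAGCTTGGCCAT"
allele_without_site = "ACGTACGTAAGCTCGGCCAT"

print(fragment_lengths(allele_with_site))     # two fragments
print(fragment_lengths(allele_without_site))  # one uncut fragment
```

Because the recognition sequence is palindromic, scanning one strand is sufficient to find every site.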
DESCRIPTION


Human coagulation factor V mRNA, complete cds. 


Homo sapiens ITGB3 gene, intron 2, fragment C, partial sequence. 


H. sapiens hexokinase I (MK-16) 


H.sapiens mRNA for NEFA protein 


Human nerve growth factor receptor mRNA, complete cds. 


ACTIN, ALPHA SKELETAL MUSCLE (ALPHA-ACTIN 1). 


H.sapiens mRNA for tre oncogene (clone 210) 


Human thermostable phenol sulfotransferase (STP2) gene, partial cds. 


Human ADP-ribosylation factor 4 (ARF4) mRNA, complete cds. 


Homo sapiens MSH55 gene, partial cds; and CLIC1, DDAH, G6b, G6c,
G5b, G6d, G6e, G6f, BAT5, G5b, CSK2B, BAT4, G4, Apo M, BAT3, 
BAT2, AIF-1, 1C7, LST-1, LTB, TNF, and LTA genes, complete cds. 


Homo sapiens (clones lambda gMHC 1,2,3, and 4) beta-myosin heavy chain 
(MYH7) gene, complete cds. 


Human flavin-containing monooxygenase (FMO1) mRNA, complete cds.


Human flavin-containing monooxygenase (FMO1) mRNA, complete cds.


Homo sapiens methionine adenosyltransferase alpha subunit gene fragment. 


H.sapiens YAP65 mRNA 


H.sapiens YAP65 mRNA 


Human coagulation factor VII (F7) gene exon 1 and factor X (F10) gene, 
exon 1. 


Human apolipoprotein A-IV gene, complete cds. 
DESCRIPTION


Homo sapiens laminin-related protein (LamA3) mRNA, complete cds. 


H.sapiens ALK-3 mRNA. 


H.sapiens ALK-3 mRNA. 


Homo sapiens mRNA for platelet glycoprotein VI, complete cds. 


Homo sapiens mRNA for platelet glycoprotein VI, complete cds. 


Human gene for thrombopoietin, exon1-exon6, complete cds.


Homo sapiens PAC clone RP1-102K2 from 22q12.1-qter, complete
sequence. 


H.sapiens mRNA for hepatocyte nuclear factor 4c 


H.sapiens gene for beta-nerve growth factor (beta-NGF) 


Human phenylethanolamine N-methyltransferase gene, complete cds. 


PERIOD CIRCADIAN PROTEIN 2 (KIAA0347). 


PROBABLE G PROTEIN-COUPLED RECEPTOR APJ. 


Homo sapiens kidney and cardiac voltage dependent K+ channel (KvLQT1)
mRNA, complete cds. 


Homo sapiens DNA sequence from PAC 329E20 on chromosome 1 p34.4- 
36.13. Contains endothelin-converting-enzyme 1 (ECE-1), EST, STS, CA 
repeat 


ABCC1: ATP-binding cassette, sub-family C (CFTR/MRP), member 1


ABCC1: ATP-binding cassette, sub-family C (CFTR/MRP), member 1


ABCB4: ATP-binding cassette, sub-family B (MDR/TAP), member 4 


Human protein tyrosine phosphatase (PTPase-alpha) mRNA. 
WE CLAIM:

1. An isolated polynucleotide encoded by a phenotype associated (PA) gene; the polynucleotide 
is selected from the group comprising 

SEQ ID 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26,
27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 
52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 
77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 
101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 
119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 
137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 
155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168 with allelic variation as 
indicated in the sequences section contained in a functional surrounding like full length 
cDNA for PA gene polypeptide and with or without the PA gene promoter sequence. 

2. An expression vector containing one or more of the polynucleotides of claim 1.

3. A host cell containing the expression vector of claim 2. 

4. A substantially purified PA gene polypeptide encoded by a polynucleotide of claim 1.

5. A method for producing a PA gene polypeptide, wherein the method comprises the following 
steps: 

a) culturing the host cell of claim 3 under conditions suitable for the expression of the 
PA gene polypeptide; and 

b) recovering the PA gene polypeptide from the host cell culture. 

6. A method for the detection of a polynucleotide of claim 1 or a PA gene polypeptide of claim 
4 comprising the steps of: 

contacting a biological sample with a reagent which specifically interacts with the 
polynucleotide or the PA gene polypeptide. 

7. A method of screening for agents which regulate the activity of a PA gene comprising the 
steps of: 
contacting a test compound with a PA gene polypeptide encoded by any polynucleotide of
claim 1; and detecting PA gene activity of the polypeptide, wherein a test compound which
increases the PA gene polypeptide activity is identified as a potential therapeutic agent for
increasing the activity of the PA gene polypeptide and wherein a test compound which
decreases the PA gene polypeptide activity is identified as a potential therapeutic agent for
decreasing the activity of the PA gene polypeptide.

8. A reagent that modulates the activity of a PA polypeptide or a polynucleotide wherein said
reagent is identified by the method of claim 7.

9. A pharmaceutical composition, comprising: 

the expression vector of claim 2 or the reagent of claim 8 and a pharmaceutically acceptable
carrier.

10. Use of the reagent according to claim 8 for the preparation of a medicament. 

11. A method for determining whether a human subject has, or is at risk of developing, a
cardiovascular disease, comprising determining the identity of nucleotide variations as
indicated in the sequences section of SEQ ID 1-168 of the PA gene locus of the subject and 
where the SNP class of the SNP is "CVD" as can be seen from table 3; whereas a "risk" 
genotype has a risk ratio of greater than 1 as can be seen from table 6. 

12. A method for determining a patient's individual response to statin therapy, including drug 
efficacy and adverse drug reactions, comprising determining the identity of nucleotide 
variations as indicated in the sequences section of SEQ ID 1-168 of the PA gene locus of the 
subject and where the SNP class of the SNP is "ADR", "EFF" or both as can be seen from 
table 3; whereas the probability for such response can be seen from table 6. 

13. Use of the method according to claim 12 for the preparation of a medicament tailored to suit 
a patient's individual response to statin therapy. 

14. A kit for assessing cardiovascular status or statin response, said kit comprising 

a) sequence determination primers and 

b) sequence determination reagents 
wherein said primers are selected from the group comprising primers that hybridize to
polymorphic positions in human PA genes according to claim 1; and primers that
hybridize immediately adjacent to polymorphic positions in human PA genes according
to claim 1.

15. A kit as defined in claim 12 detecting a combination of two or more, up to all, polymorphic
sites selected from the groups of sequences as defined in claim 1.

16. A kit for assessing cardiovascular status or statin response, said kit comprising one or more
antibodies specific for a polymorphic position defined in claim 1 within the human PA gene
polypeptides and combinations of any of the foregoing.
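
For illustration only, the risk ratio criterion referred to in claim 11 can be sketched in Python as follows. The sketch uses the standard epidemiological definition (risk among carriers of a genotype divided by risk among non-carriers); whether the values in table 6 were computed in exactly this way is an assumption. The counts are hypothetical and the function name `risk_ratio` is introduced here only for the example.

```python
def risk_ratio(affected_carriers: int, unaffected_carriers: int,
               affected_noncarriers: int, unaffected_noncarriers: int) -> float:
    """Risk of the phenotype among genotype carriers divided by the risk among non-carriers."""
    carriers = affected_carriers + unaffected_carriers
    noncarriers = affected_noncarriers + unaffected_noncarriers
    if carriers == 0 or noncarriers == 0 or affected_noncarriers == 0:
        raise ValueError("risk ratio is undefined for these counts")
    return (affected_carriers / carriers) / (affected_noncarriers / noncarriers)


# Hypothetical counts, chosen only to illustrate the arithmetic.
rr = risk_ratio(30, 70, 15, 85)
print(f"risk ratio = {rr:.2f}")
print("risk genotype" if rr > 1 else "not a risk genotype")
```

A value above 1 means the phenotype is more frequent among carriers of the genotype than among non-carriers, which is the sense in which claim 11 speaks of a "risk" genotype.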
Abstract of the Disclosure



Provided are diagnostic methods and kits, including oligonucleotides and/or polynucleotides or
derivatives thereof as well as antibodies, for determining whether a human subject is at risk of an
adverse drug reaction after statin therapy, or whether the human subject is a high or low responder or
a good or bad metabolizer of statins. Further provided are diagnostic methods and kits, including
antibodies, for determining whether a human subject is at risk for a cardiovascular disease. Also
provided are polymorphic sequences and other genes, and isolated polynucleotides encoding a
phenotype associated (PA) gene polypeptide, useful in methods to identify therapeutic agents and
for the preparation of a medicament to treat cardiovascular disease or influence drug response.